High Throughput Cell-Based Screening for Aptamers

ABSTRACT

The invention provides eukaryotic cell-based screening methods to identify an aptamer that specifically binds a ligand, or a ligand that specifically binds an aptamer, using a polynucleotide cassette for the regulation of the expression of a reporter gene where the polynucleotide cassette contains a riboswitch in the context of a 5′ intron-alternative exon-3′ intron. The riboswitch comprises an effector region and an aptamer such that when the aptamer binds a ligand, reporter gene expression occurs.

FIELD OF THE INVENTION

The invention provides screening methods to identify an aptamer that specifically binds a ligand, or a ligand that specifically binds an aptamer, in a eukaryotic cell using a polynucleotide cassette for the regulation of the expression of a reporter gene where the polynucleotide cassette contains a riboswitch in the context of a 5′ intron-alternative exon-3′ intron. The riboswitch comprises an effector region and an aptamer such that when the aptamer binds a ligand, reporter gene expression occurs.

BACKGROUND OF THE INVENTION

Splicing refers to the process by which intronic sequence is removed from the nascent pre-messenger RNA (pre-mRNA) and the exons are joined together to form the mRNA. Splice sites are junctions between exons and introns, and are defined by different consensus sequences at the 5′ and 3′ ends of the intron (i.e., the splice donor and splice acceptor sites, respectively). Alternative pre-mRNA splicing, or alternative splicing, is a widespread process occurring in most human genes containing multiple exons. It is carried out by a large multi-component structure called the spliceosome, which is a collection of small nuclear ribonucleoproteins (snRNPs) and a diverse array of auxiliary proteins. By recognizing various cis regulatory sequences, the spliceosome defines exon/intron boundaries, removes intronic sequences, and splices together the exons into a final translatable message (i.e., the mRNA). In the case of alternative splicing, certain exons can be included or excluded to vary the final coding message thereby changing the resulting expressed protein.

The present invention utilizes ligand/aptamer-mediated control of alternative splicing to identify aptamer/ligand pairs that bind in the context of a target eukaryotic cell. Prior to the present invention, aptamers have been generated against a variety of ligands through in vitro screening; however, few have proved to be effective in cells, highlighting a need for systems to screen aptamers that function in the organism of choice.

SUMMARY OF THE INVENTION

In one aspect, the present invention provides a method for selecting an aptamer that binds a ligand in eukaryotic cells comprising the steps of:

(a) providing a library of aptamers,

(b) introducing members of the library of aptamers into a polynucleotide cassette for the ligand-mediated expression of a reporter gene to create a library of riboswitches,

(c) introducing the library of riboswitches into eukaryotic cells,

(d) contacting the eukaryotic cells with a ligand, and

(e) measuring expression of the reporter gene,

wherein the polynucleotide cassette comprises an alternatively-spliced exon, flanked by a 5′ intron and a 3′ intron, and a riboswitch comprising (i) an effector region comprising a stem that includes the 5′ splice site of the 3′ intron, and (ii) an aptamer, wherein the alternatively-spliced exon comprises a stop codon that is in-frame with the reporter gene when the alternatively-spliced exon is spliced into the reporter gene mRNA.

In one embodiment, the library of aptamers comprises aptamers having one or more randomized nucleotides. In one embodiment, the library of aptamers comprises aptamers having fully randomized sequences. In one embodiment, the library of aptamers comprises aptamers that are between about 15 and about 200 nucleotides in length. In one embodiment, the library of aptamers comprises aptamers that are between about 30 and about 100 nucleotides in length. In one embodiment, the library of aptamers comprises more than 100,000 aptamers. In one embodiment, the library of aptamers comprises more than 1,000,000 aptamers.

In one embodiment, the ligand is a small molecule. In one embodiment, the small molecule ligand is exogenous to the eukaryotic cell. In another embodiment, the ligand is a molecule produced by the eukaryotic cell including, e.g., a metabolite, nucleic acid, vitamin, co-factor, lipid, monosaccharide, and second messenger.

In one embodiment, the eukaryotic cell is selected from a mammalian cell, an insect cell, a plant cell, and a yeast cell. In one embodiment, the eukaryotic cell is derived from a mouse, a human, a fly (e.g., Drosophila melanogaster), a fish (e.g., Danio rerio) or a nematode worm (e.g., Caenorhabditis elegans).

In one embodiment, the reporter gene is selected from the group consisting of a fluorescent protein, luciferase, β-galactosidase and horseradish peroxidase. In one embodiment, the reporter gene is a cytokine, a signaling molecule, a growth hormone, an antibody, a regulatory RNA, a therapeutic protein, or a peptide. In one embodiment, the expression of the reporter gene is greater than about 10-fold higher when the ligand specifically binds the aptamer than the reporter gene expression levels when the ligand is absent. In further embodiments, the expression of the reporter gene is greater than about 20, 50, 100, 200, 500, or 1,000-fold higher when the ligand specifically binds the aptamer than the reporter gene expression levels when the ligand is absent.

In one embodiment, the 5′ and 3′ introns are derived from intron 2 of the human β-globin gene. In one embodiment, the 5′ intron comprises a stop codon in-frame with the target gene. In one embodiment, the 5′ and 3′ introns are each independently from about 50 to about 300 nucleotides in length. In one embodiment, the 5′ and 3′ introns are each independently from about 125 to about 240 nucleotides in length. In one embodiment, the 5′ and/or 3′ introns have been modified to include, or alter the sequence of, an intron splice enhancer, a 5′ splice site, a 3′ splice site, or the branch point sequence.

In one embodiment, the effector region stem of the riboswitch is about 7 to about 20 base pairs in length. In one embodiment, the effector region stem is 8 to 11 base pairs in length.

In one embodiment, the alternatively-spliced exon is derived from exon 2 of the human dihydrofolate reductase gene (DHFR), mutant human Wilms tumor 1 exon 5, mouse calcium/calmodulin-dependent protein kinase II delta exon 16, or SIRT1 exon 6. In one embodiment, the alternatively-spliced exon is a modified DHFR exon 2. In one embodiment, the alternatively-spliced exon has been modified in one or more of the group consisting of altering the sequence of an exon splice silencer, altering the sequence of an exon splice enhancer, adding an exon splice enhancer, and adding an exon splice donor. In one embodiment, the alternatively-spliced exon is synthetic (i.e., not derived from a naturally-occurring exon).

In one embodiment, the library of aptamers is divided into a smaller aptamer library before being introduced into the polynucleotide cassettes, by a method comprising the steps of:

(a) providing a randomized aptamer library wherein the aptamers in the library comprise multiple 5′ and 3′ constant regions and one or more randomized nucleotides,

(b) performing a two-cycle PCR using the randomized aptamer library as the template and a first primer and second primer that are complementary to the 5′ and 3′ constant regions,

(c) isolating the products of the two-cycle PCR, and

(d) PCR amplifying a subset of the isolated products of the two-cycle PCR using multiple primers complementary to a subset of the unique 5′ and 3′ constant regions.

In one embodiment, the library of riboswitches is divided into one or more sub-libraries of riboswitches before being introduced into the eukaryotic cells. In one embodiment, the method for dividing the riboswitch library into sub-libraries comprises the steps of:

(a) introducing a library of aptamers into a plasmid comprising a gene regulation polynucleotide cassette to make a riboswitch library;

(b) introducing the riboswitch library into bacteria (e.g., E. coli); and

(c) collecting bacterial clones (for example by picking bacterial colonies) and extracting plasmid DNA to obtain plasmid sub-libraries of riboswitches (referred to herein as primary sub-libraries).

In embodiments, secondary sub-libraries of riboswitches are generated from a primary plasmid sub-library of riboswitches by introducing a primary sub-library into bacteria, collecting bacterial clones and isolating the plasmid DNA. The primary or secondary sub-libraries are then introduced into eukaryotic cells, the eukaryotic cells contacted with a ligand, and expression of the reporter gene measured to determine whether one or more aptamers in the library bind the ligand in the context of the eukaryotic cell.

In one embodiment, the present invention includes an aptamer that binds a target ligand wherein the aptamer is selected by the above methods. In embodiments of the invention, the aptamer comprises the sequence of SEQ ID NO: 14 to 27. In one embodiment, the aptamer sequence comprises the sequence of SEQ ID NO: 24.

In another aspect, the invention provides a method for selecting a ligand that binds an aptamer in a eukaryotic cell comprising the steps of:

(a) providing a library of ligands,

(b) providing a polynucleotide cassette for the ligand-mediated expression of a reporter gene,

(c) introducing the polynucleotide cassette into the eukaryotic cell,

(d) contacting individual groups of the eukaryotic cell with members of the library of ligands, and

(e) measuring the expression of the reporter gene,

wherein the polynucleotide cassette comprises an alternatively-spliced exon, flanked by a 5′ intron and a 3′ intron, and a riboswitch comprising (i) an effector region comprising a stem that includes the 5′ splice site of the 3′ intron, and (ii) an aptamer, wherein the alternatively-spliced exon comprises a stop codon that is in-frame with the reporter gene when the alternatively-spliced exon is spliced into the reporter gene mRNA.

In one embodiment, the ligand is a small molecule. In one embodiment, the small molecule ligand is exogenous to the eukaryotic cell. In another embodiment, the ligand is a molecule produced by the eukaryotic cell including, e.g., a metabolite, nucleic acid, vitamin, co-factor, lipid, monosaccharide, and second messenger.

In one embodiment, the eukaryotic cell is selected from a mammalian cell, an insect cell, a plant cell, and a yeast cell. In one embodiment, the eukaryotic cell is derived from a mouse, a human, a fly (e.g., Drosophila melanogaster), a fish (e.g., Danio rerio) or a nematode worm (e.g., Caenorhabditis elegans).

In one embodiment, the reporter gene is selected from the group consisting of a fluorescent protein, luciferase, β-galactosidase and horseradish peroxidase. In one embodiment, the reporter gene is a cytokine, a signaling molecule, a growth hormone, an antibody, a regulatory RNA, a therapeutic protein, or a peptide. In one embodiment, the expression of the reporter gene is greater than about 10-fold higher when the ligand specifically binds the aptamer than the reporter gene expression levels when the ligand is absent. In further embodiments, the expression of the reporter gene is greater than about 20, 50, 100, 200, 500, or 1,000-fold higher when the ligand specifically binds the aptamer than the reporter gene expression levels when the ligand is absent.

In one embodiment, the 5′ and 3′ introns are derived from intron 2 of the human β-globin gene. In one embodiment, the 5′ intron comprises a stop codon in-frame with the target gene. In one embodiment, the 5′ and 3′ introns are each independently from about 50 to about 300 nucleotides in length. In one embodiment, the 5′ and 3′ introns are each independently from about 125 to about 240 nucleotides in length. In one embodiment, the 5′ and/or 3′ introns have been modified to include, or alter the sequence of, an intron splice enhancer, an exon splice enhancer, a 5′ splice site, a 3′ splice site, or the branch point sequence.

In one embodiment, the effector region stem of the riboswitch is about 7 to about 20 base pairs in length. In one embodiment, the effector region stem is 8 to 11 base pairs in length.

In one embodiment, the alternatively-spliced exon is derived from exon 2 of the human dihydrofolate reductase gene (DHFR), mutant human Wilms tumor 1 exon 5, mouse calcium/calmodulin-dependent protein kinase II delta exon 16, or SIRT1 exon 6. In one embodiment, the alternatively-spliced exon is a modified DHFR exon 2. In one embodiment, the alternatively-spliced exon has been modified in one or more of the group consisting of altering the sequence of an exon splice silencer, altering the sequence of an exon splice enhancer, adding an exon splice enhancer, and adding an exon splice donor. In one embodiment, the alternatively-spliced exon is synthetic (i.e., not derived from a naturally-occurring exon).

In one embodiment, the present invention includes a ligand selected by the above methods.

In another aspect, the invention provides a method for splitting a randomized aptamer library into smaller aptamer sub-libraries comprising the steps of:

(a) providing a randomized aptamer library wherein the aptamers in the library comprise multiple 5′ and 3′ constant regions and one or more randomized nucleotides,

(b) performing a two-cycle PCR using the randomized aptamer library as the template and first primers and second primers that are complementary to the 5′ and 3′ constant regions,

(c) isolating the products of the two-cycle PCR, and

(d) PCR amplifying a subset of the isolated products of the two-cycle PCR using primers complementary to a subset of the unique 5′ and 3′ constant regions.

In one embodiment, the randomized aptamer library comprises aptamers having one or more randomized nucleotides. In one embodiment, the randomized aptamer library comprises more than about 100,000 aptamers. In one embodiment, the randomized aptamer library comprises more than about 1,000,000 aptamers.

In one embodiment, the first or second primer in the two-cycle PCR comprises a label selected from the group consisting of biotin, digoxigenin (DIG), bromodeoxyuridine (BrdU), a fluorophore, and a chemical group (e.g., a thiol group, or an azide used in click chemistry).

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 a . Schematic of the riboswitch construct. A truncated beta-globin intron sequence was inserted in the coding sequence of the reporter gene, and a mutant, stop-codon containing DHFR exon 2 (mDHFR) was placed in the inserted intron, thus forming a three-exon gene expression platform by which the reporter gene expression is regulated by inclusion/exclusion of the mDHFR exon. A hairpin/stem structure is formed including the U1 binding site in the intron downstream (3′) of the mDHFR exon with the engineered sequence complementary to the U1 binding site, which blocks the U1 binding, thereby leading to the exclusion of the stop-codon containing mDHFR exon and target gene expression. The aptamer sequence is grafted in between the U1 binding site and its complementary sequence, allowing the control of hairpin formation by aptamer/ligand binding.

FIG. 1 b . Dose responses of constructs with regulatory cassettes containing different aptamer-based riboswitches. Guanine riboswitches induced reporter gene expression by responding not only to guanine but also to guanosine treatment.

FIGS. 1 c and 1 d . Graph demonstrating that the xpt-G17 riboswitch induces luciferase activity upon treatment with guanine analogs.

FIG. 1 e . Fold induction of luciferase activity by the xpt-G17 riboswitch upon treatment with compounds.

FIG. 2 . Schematic of a template for generating randomized aptamer sequences. The aptamer sequence (blank bar) is flanked by constant regions (black bars), which contain a BsaI site to facilitate the cloning of the aptamer into a gene regulation cassette to generate riboswitches.

FIGS. 3 a to 3 e . Schematic description of the method for splitting a large randomized aptamer library into smaller sub-libraries.

FIG. 3 a . The schematic diagram of the two-step strategy for splitting a large aptamer library. The first step is to add a unique pair of sequence tags to each aptamer oligonucleotide template. Following the first step, templates with unique tag sequences are amplified using primers that are specific to tagged sequences.

FIG. 3 b . Three approaches to attaching tag sequences to templates: tag sequences incorporated through PCR using primers that contain tag sequences at the 5′ end of primers (I); tag sequences attached by ligating single stranded template sequence with single stranded tag sequences by T4 RNA ligase (II); tag sequences linked to templates by ligating double stranded template sequences with double stranded tag sequence by T4 DNA ligase (III).

FIG. 3 c . Schematic diagram of two-cycle PCR. For cycle 1, only reverse primers (JR), which contain a tag sequence at the 5′ end, are used. After the first cycle, the newly synthesized strand has a sequence tag at its 5′ end. For cycle 2, biotin-labeled forward primer JF is added to the PCR reaction, which can only use the newly synthesized strand as template, thus generating the templates with tag sequences at both the 5′ and 3′ ends and a biotin molecule at the 5′ end.

FIG. 3 d . Generation of a tagged aptamer library. After labeling templates with sequence tags and a biotin molecule, streptavidin beads are used to separate the labeled/tagged single stranded templates from the rest of the reaction components through denaturing the oligos and bead washing. Then the tagged templates are amplified and expanded using a mixture of primers (F and R primers) that are specific to the tagged sequences, thus generating a tagged aptamer library that is ready for subsequent PCR using a single pair of tag sequence-specific primers to generate sub-libraries of the original aptamer library.

FIG. 3 e . Sub-libraries of aptamers are PCR amplified using the splitting strategy. The aptamer library (10⁶, generated as in Example 2) was tagged by PCR using 2 forward primers (JF1-2) and 8 reverse primers (JR1-8), with template copy number at 1, 2.3 or 4.6. The isolated tagged templates were expanded by a mixture of tag-specific primers F1-2 and R1-8, and the PCR products were subjected to PCR with either universal primers (left panel), a single pair of tag-specific primers F1 and R1 (middle panel), or a single pair of non-relevant primers F3 and R1 (right panel). Water was used as blank control for templates.
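By way of non-limiting illustration, the number of sub-libraries addressable with 2 forward and 8 reverse tags can be sketched as follows. The tag names mirror the F1-2 and R1-8 primers of the legend; the assumption that templates are distributed evenly across tag pairs is made here for illustration only and is not stated in the specification.

```python
from itertools import product

# Tag names follow the legend (F1-2, R1-8); even tagging is an assumption.
forward_tags = [f"F{i}" for i in range(1, 3)]  # 2 forward tags
reverse_tags = [f"R{i}" for i in range(1, 9)]  # 8 reverse tags

# Each (forward, reverse) pair addresses one sub-library by tag-specific PCR.
tag_pairs = list(product(forward_tags, reverse_tags))
n_sublibraries = len(tag_pairs)

library_size = 10**6  # library size given in the legend
# Expected members per sub-library if every tag pair is equally likely.
expected_per_sublibrary = library_size / n_sublibraries

print(n_sublibraries, expected_per_sublibrary)
```

Under this even-distribution assumption, a primer pair such as F3 and R1 addresses no tag pair in the set, which is consistent with its use as a non-relevant control.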

FIG. 4 . Sensitivity test on cell-based assay for riboswitch library screening. Construct xpt-G17 was mixed with construct SR-mut at different molecular ratios, and the mixed construct DNA was transfected into HEK-293 cells and treated with guanine. The fold induction of luciferase activity was calculated as luciferase activity induced with guanine divided by luciferase activity obtained without guanine treatment.
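The fold induction calculation described in this legend can be sketched as follows. The function name and the luminescence readings are hypothetical and are given only to illustrate the ratio.

```python
def fold_induction(signal_with_ligand: float, signal_without_ligand: float) -> float:
    """Ratio of reporter activity with ligand to activity without ligand."""
    if signal_without_ligand <= 0:
        # A non-positive baseline makes the ratio meaningless.
        raise ValueError("baseline signal must be positive")
    return signal_with_ligand / signal_without_ligand

# Hypothetical luciferase readings (arbitrary units), not data from the figures.
example = fold_induction(5000.0, 250.0)
print(example)
```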

FIG. 5 a . Schematic diagram of construction of a plasmid library containing riboswitches. The single stranded aptamer oligos are first PCR amplified using universal primers to convert the single stranded aptamer template to double stranded. The double stranded oligos are then digested with BsaI and ligated to BsaI-digested vector to generate constructs with riboswitches. The plasmid DNA is then electroporated into electro-competent DH5α cells. More than 5×10⁶ colonies are collected to cover more than 99% of the initial aptamer library (10⁶).
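The relationship between colony number and library coverage can be illustrated with a simple sampling estimate. This sketch assumes that every library member is equally represented and that each colony is an independent draw, so the expected fraction of members sampled at least once is approximately 1 − exp(−colonies/library size). The specification does not state which model underlies the 99% figure; the function names are hypothetical.

```python
import math

def expected_coverage(library_size: int, colonies: int) -> float:
    """Expected fraction of library members sampled at least once."""
    return 1.0 - math.exp(-colonies / library_size)

def colonies_for_coverage(library_size: int, target: float) -> int:
    """Smallest colony count whose expected coverage reaches the target."""
    return math.ceil(-library_size * math.log(1.0 - target))

coverage = expected_coverage(10**6, 5 * 10**6)  # 5x oversampling of 10^6 members
needed = colonies_for_coverage(10**6, 0.99)

print(f"{coverage:.4f}", needed)
```

Under these assumptions, five-fold oversampling gives an expected coverage of about 99.3%, and roughly 4.6×10⁶ colonies would suffice for 99%.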

FIG. 5 b . Schematic diagram of dividing a plasmid library of riboswitches into sub-libraries. The plasmid library of riboswitches is transformed into chemically competent DH5α cells. The transformed bacteria are then plated onto agar plates. Certain numbers of bacterial colonies are collected from each individual agar plate, and plasmid DNA is extracted from each colony collection separately. The plasmid DNA obtained from each collection of colonies forms a sub-library of riboswitches. The dividing approach can be repeated until the desired size of sub-libraries is achieved.

FIG. 5 c . Unique sequence composition of secondary sub-libraries of the riboswitch determined by Next Generation Sequencing. Sequences with more than 12 reads from the sequencing run were considered true sequences.
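The read-count threshold described in this legend can be sketched as a simple filter. The function name and the toy reads are hypothetical; only the rule that sequences with more than 12 reads are retained is taken from the legend.

```python
from collections import Counter

def true_sequences(reads, min_reads_exclusive=12):
    """Return sequences observed in more than `min_reads_exclusive` reads."""
    counts = Counter(reads)
    return {seq for seq, n in counts.items() if n > min_reads_exclusive}

# Toy reads for illustration: 13, 12, and 1 copies of three sequences.
reads = ["ACGT"] * 13 + ["GGCC"] * 12 + ["TTAA"]
kept = true_sequences(reads)
print(kept)
```

Only the sequence with 13 reads passes, since exactly 12 reads does not exceed the threshold.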

FIG. 5 d . Comparison of unique sequence composition between two secondary sub-libraries that are generated from the same primary sub-library P1S_003. A pie chart indicates the number of unique sequences in each sub-library and the number of the overlapping sequences between the two libraries of riboswitches.

FIG. 5 e . Comparison of unique sequence composition between two secondary sub-libraries that are generated from different primary sub-libraries, P1S_003 and P1S_007, respectively. A pie chart indicates the number of unique sequences in each sub-library and the number of the overlapping sequences between the two libraries of riboswitches.
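A comparison of the kind summarized in the pie charts of FIGS. 5 d and 5 e can be sketched with set operations. The function name and the toy sequences are hypothetical and do not reproduce the sequencing data.

```python
def compare_sublibraries(seqs_a, seqs_b):
    """Count sequences unique to each sub-library and shared by both."""
    a, b = set(seqs_a), set(seqs_b)
    return {
        "only_a": len(a - b),   # found only in the first sub-library
        "only_b": len(b - a),   # found only in the second sub-library
        "shared": len(a & b),   # found in both sub-libraries
    }

# Toy sequence sets for illustration only.
result = compare_sublibraries(["AAC", "GGT", "CCA"], ["GGT", "TTG"])
print(result)
```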

FIGS. 6 a and 6 b . Plasmid DNA from 6 out of 100 primary sub-libraries (60 k) (FIG. 6 a ) or 100 secondary sub-libraries (size of 600) (FIG. 6 b ) was arrayed in the format of a 96-well plate, and transfected into HEK-293 cells. The fold induction of luciferase activity was calculated as luciferase activity induced with guanine divided by luciferase activity obtained without guanine treatment.

FIG. 6 c . Riboswitch sub-library screening results using nicotinamide adenine dinucleotide (NAD+) as ligand. The sub-libraries of the P2 riboswitch library were arrayed in 96-well format. HEK 293 cells were plated in a 96-well plate and transfected with riboswitch library DNAs. Four hours after transfection, cells were treated with 100 μM NAD+. Luciferase activity was measured 20 hours after NAD+ treatment. The fold induction was calculated as the ratio of the luciferase activity obtained from NAD+ treated cells divided by luciferase activity obtained from cells without NAD+ treatment. Each dot in the dot plot represents the fold induction from a sub-library or G17 construct as indicated.

FIG. 6 d . Riboswitch screening results using NAD+ as ligand. Each individual riboswitch construct was arrayed in 96-well format. HEK 293 cells were plated in a 96-well plate and transfected with riboswitch constructs. Four hours after transfection, cells were treated with 100 μM NAD+. Luciferase activity was measured 20 hours after NAD+ treatment. The fold induction was calculated as the ratio of the luciferase activity obtained from NAD+ treated cells divided by luciferase activity obtained from cells without NAD+ treatment. Each dot in the dot plot represents the fold induction from each single riboswitch construct or G17 construct as indicated.

FIGS. 6 e and 6 f . A construct with a new aptamer sequence shows an enhanced response to NAD+ treatment in a dose-dependent manner compared to the G17 riboswitch. HEK 293 cells were transfected with the G17 construct or construct #46 with the new aptamer sequence. Four hours after transfection, cells were treated with different doses of NAD+. Luciferase activity was measured 20 hours after NAD+ treatment. The fold induction was calculated as the ratio of the luciferase activity obtained from NAD+ treated cells divided by luciferase activity obtained from cells without NAD+ treatment.

DETAILED DESCRIPTION OF THE INVENTION

Methods of Screening Aptamer/Ligand

The present invention provides screening methods to identify aptamers that bind to a ligand, and ligands that bind to an aptamer, in the context of a eukaryotic cell, tissue, or organism. In one aspect, the present invention provides a method for selecting an aptamer that binds a ligand in eukaryotic cells comprising the steps of:

(a) providing a library of aptamers,

(b) introducing members of the library of aptamers into polynucleotide cassettes for the ligand-mediated expression of a reporter gene,

(c) introducing the aptamer-containing polynucleotide cassettes into eukaryotic cells,

(d) contacting the eukaryotic cells with a ligand, and

(e) measuring expression of the reporter gene.

In another aspect, the invention provides a method for selecting a ligand that binds an aptamer in a eukaryotic cell comprising the steps of:

(a) providing a library of ligands,

(b) providing a polynucleotide cassette for the ligand-mediated expression of a reporter gene,

(c) introducing the polynucleotide cassette into the eukaryotic cell,

(d) contacting individual groups of the eukaryotic cell with members of the library of ligands, and

(e) measuring the expression of the reporter gene.

In one embodiment, the invention provides methods to identify aptamers that bind to intracellular molecules comprising the steps of:

(a) providing a library of aptamers,

(b) introducing members of the library of aptamers into polynucleotide cassettes for the ligand-mediated expression of a reporter gene,

(c) introducing the aptamer-containing polynucleotide cassettes into eukaryotic cells, and

(d) measuring expression of the reporter gene.

The screening methods of the present invention utilize the gene regulation polynucleotide cassettes disclosed in PCT/US2016/016234, which is incorporated in its entirety herein by reference. These gene regulation cassettes comprise a riboswitch in the context of a 5′ intron-alternative exon-3′ intron. The gene regulation cassette refers to a recombinant DNA construct that, when incorporated into the DNA of a target gene (e.g., a reporter gene), provides the ability to regulate expression of the target gene by aptamer/ligand mediated alternative splicing of the resulting pre-mRNA. The gene regulation cassette further comprises a riboswitch containing a sensor region (e.g., an aptamer) and an effector region that together are responsible for sensing the presence of a ligand that binds the aptamer and altering splicing to an alternative exon. These aptamer-driven riboswitches provide regulation of mammalian gene expression at a 2- to 2000-fold induction in response to treatment with the ligand that binds the aptamer. The unprecedented high dynamic regulatory range of this synthetic riboswitch is used in methods of the present invention to provide screening systems for new aptamers against desired types of ligands, as well as for optimal ligands against known and novel aptamers in cells, tissues and organisms.

Riboswitch

The term “riboswitch” as used herein refers to a regulatory segment of an RNA polynucleotide (or the DNA encoding the RNA polynucleotide). A riboswitch in the context of the present invention contains a sensor region (e.g., an aptamer) and an effector region that together are responsible for sensing the presence of a ligand (e.g., a small molecule) and altering splicing to an alternative exon. In one embodiment, the riboswitch is recombinant, utilizing polynucleotides from two or more sources. The term “synthetic” as used herein in the context of a riboswitch refers to a riboswitch that is not naturally occurring. In one embodiment, the sensor and effector regions are joined by a polynucleotide linker. In one embodiment, the polynucleotide linker forms an RNA stem (i.e., a region of the RNA polynucleotide that is double-stranded).

A library of riboswitches as described herein comprises a plurality of aptamer sequences that differ by one or more nucleotides in the context of the polynucleotide cassettes for the ligand-mediated expression of a reporter gene. Thus, each aptamer in the library, along with an effector region, is in the context of a 5′ intron-alternative exon-3′ intron as described herein.

Effector Region

In one embodiment, the effector region comprises the 5′ splice site (“5′ ss”) sequence of the 3′ intron (i.e., the intronic splice site sequence that is immediately 3′ of the alternative exon). The effector region comprises the 5′ ss sequence of the 3′ intron and sequence complementary to the 5′ ss sequence of the 3′ intron. When the aptamer binds its ligand, the effector region forms a stem and thus prevents splicing to the splice donor site at the 3′ end of the alternative exon. Under certain conditions (for example, when the aptamer is not bound to its ligand), the effector region is in a context that provides access to the splice donor site at the 3′ end of the alternative exon leading to inclusion of the alternative exon in the target gene mRNA.
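The switching logic described above can be sketched as a simplified binary model. This illustration assumes an all-or-nothing response (ligand bound means the alternative exon is skipped; ligand absent means it is included), whereas actual splicing outcomes are graded. The function names and the example exon sequence are hypothetical.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard stop codons in mRNA

def has_in_frame_stop(exon: str, frame_offset: int = 0) -> bool:
    """True if the exon carries a stop codon in the given reading frame."""
    codons = (exon[i:i + 3] for i in range(frame_offset, len(exon) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)

def reporter_expressed(ligand_bound: bool, exon: str, frame_offset: int = 0) -> bool:
    """Simplified model: ligand binding forms the stem and hides the 5' ss."""
    exon_included = not ligand_bound  # accessible 5' ss leads to exon inclusion
    return not (exon_included and has_in_frame_stop(exon, frame_offset))

example_exon = "AUGGCUUAAGGC"  # hypothetical exon with UAA in frame 0
print(reporter_expressed(True, example_exon), reporter_expressed(False, example_exon))
```

In this model the reporter is expressed only when the ligand is bound, because inclusion of the stop-codon-containing exon truncates translation of the reporter.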

The stem portion of the effector region should be of a sufficient length (and GC content) to substantially prevent alternative splicing of the alternative exon upon ligand binding the aptamer, while also allowing access to the splice site when the ligand is not present in sufficient quantities. In embodiments of the invention, the stem portion of the effector region comprises stem sequence in addition to the 5′ ss sequence of the 3′ intron and its complementary sequence. In embodiments of the invention, this additional stem sequence comprises sequence from the aptamer stem. The length and sequence of the stem portion can be modified using known techniques in order to identify stems that allow acceptable background expression of the target gene when no ligand is present and acceptable expression levels of the target gene when the ligand is present. If the stem is, for example, too long it may hide access to the 5′ ss sequence of the 3′ intron in the presence or absence of ligand. If the stem is too short, it may not form a stable stem capable of sequestering the 5′ ss sequence of the 3′ intron, in which case the alternative exon will be spliced into the target gene message in the presence or absence of ligand. In one embodiment, the total length of the effector region stem is between about 7 base pairs to about 20 base pairs. In some embodiments, the length of the stem is between about 8 base pairs to about 11 base pairs. In some embodiments, the length of the stem is 8 base pairs to 11 base pairs. In addition to the length of the stem, the GC base pair content of the stem can be altered to modify the stability of the stem.
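The stem parameters discussed above (length and GC content) can be tabulated for a candidate stem with a short sketch. The example sequence is hypothetical and is not taken from the sequence listing; only Watson-Crick pairs are considered, so G-U wobble pairs and thermodynamic stability are outside this illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna: str) -> str:
    """Strand that would pair with `rna` through Watson-Crick base pairs."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))

def gc_fraction(rna: str) -> float:
    """Fraction of G and C bases in one strand of the stem."""
    return sum(base in "GC" for base in rna) / len(rna)

def stem_length_in_range(stem_bp: int, low: int = 7, high: int = 20) -> bool:
    """Check a stem length against the about 7 to about 20 bp range."""
    return low <= stem_bp <= high

splice_site_strand = "GUAAGUAU"  # hypothetical 8-nt stem strand
partner_strand = reverse_complement(splice_site_strand)

print(partner_strand, gc_fraction(splice_site_strand),
      stem_length_in_range(len(splice_site_strand)))
```

Raising the GC fraction or extending the stem would be expected to stabilize it, which is the trade-off the paragraph above describes.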

Aptamer/Ligand

The term “aptamer” as used herein refers to an RNA polynucleotide (or the DNA encoding the RNA polynucleotide) that specifically binds to a ligand or to an RNA polynucleotide that is being screened to identify specific binding to a ligand (i.e., a prospective aptamer). A library of aptamers is a collection of prospective aptamers comprising multiple prospective aptamers having a nucleotide sequence that differs from other members of the library by at least one nucleotide.

The term “ligand” refers to a molecule that is specifically bound by an aptamer, or to a prospective ligand that is being screened for the ability to bind to one or more aptamers. A library of ligands is a collection of ligands and/or prospective ligands.

In one embodiment, the ligand is a low molecular weight (less than about 1,000 Daltons) molecule including, for example, lipids, monosaccharides, second messengers, co-factors, metal ions, other natural products and metabolites, nucleic acids, as well as most therapeutic drugs. In one embodiment, the ligand is a polynucleotide with 2 or more nucleotide bases.

In one embodiment, the ligand is selected from the group consisting of 8-azaguanine, adenosine 5′-monophosphate monohydrate, amphotericin B, avermectin B1, azathioprine, chlormadinone acetate, mercaptopurine, moricizine hydrochloride, N6-methyladenosine, nadide, progesterone, promazine hydrochloride, pyrvinium pamoate, sulfaguanidine, testosterone propionate, thioguanosine, Tyloxapol and Vorinostat.

In certain embodiments, the methods of the present invention are used to identify a ligand that is an intracellular molecule that binds to the aptamer (i.e., an endogenous ligand) in the polynucleotide cassette thereby causing expression of the reporter gene. For example, cells with a reporter gene containing the polynucleotide cassette for the aptamer/ligand mediated expression can be exposed to a condition, such as heat, growth, transformation, or mutation, leading to changes in cell signaling molecules, metabolites, peptides, lipids, ions (e.g., Ca²⁺), etc. that can bind to the aptamer and cause expression of the reporter gene. Thus, the methods of the present invention can be used to identify aptamers that bind to intracellular ligands in response to changes in cell state, including, e.g., a change in cell signaling, cell metabolism, or mutations within the cells. In another embodiment, the present invention is used to identify aptamers that bind intracellular ligands present in differentiated cells. For example, the methods of the present invention may be used to identify ligands or aptamers that bind ligands that are present in induced pluripotent stem cells. In one embodiment, the methods of the present invention can be used to screen for response to cell differentiation in vivo, or physiological changes of cells in vivo.

Aptamer ligands can also be cell endogenous components that increase significantly under specific physiological/pathological conditions, such as oncogenic transformation. These may include second messenger molecules such as GTP, GDP, or calcium; fatty acids, or fatty acids that are incorrectly metabolized, such as 13-HODE in breast cancer (Flaherty, J T et al., PLoS One, Vol. 8, e63076, 2013, incorporated herein by reference); amino acids or amino acid metabolites; metabolites in the glycolysis pathway that usually have higher levels in cancer cells or in normal cells in metabolic diseases; and cancer-associated molecules such as Ras or mutant Ras protein, mutant EGFR in lung cancer, and indoleamine-2,3-dioxygenase (IDO) in many types of cancers. Endogenous ligands include progesterone metabolites in breast cancer as disclosed by J P Wiebe (Endocrine-Related Cancer (2006) 13:717-738, incorporated herein by reference). Endogenous ligands also include metabolites with increased levels resulting from mutations in key metabolic enzymes in kidney cancer, such as lactate, glutathione, and kynurenine, as disclosed by Minton, D R and Nanus, D M (Nature Reviews, Urology, Vol. 12, 2005, incorporated herein by reference).

The specificity of the binding of an aptamer to a ligand can be defined in terms of the comparative dissociation constants (Kd) of the aptamer for its ligand as compared to the dissociation constant of the aptamer for unrelated molecules. Thus, the ligand is a molecule that binds to the aptamer with greater affinity than unrelated material does. Typically, the Kd for the aptamer with respect to its ligand will be at least about 10-fold less than the Kd for the aptamer with unrelated molecules. In other embodiments, the Kd will be at least about 20-fold less, at least about 50-fold less, at least about 100-fold less, or at least about 200-fold less. An aptamer will typically be between about 15 and about 200 nucleotides in length. More commonly, an aptamer will be between about 30 and about 100 nucleotides in length.
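The fold comparison described above can be written out as a short calculation. The following is a minimal sketch only; the function names and the example Kd values are hypothetical and are not taken from the disclosure.

```python
def fold_selectivity(kd_ligand: float, kd_unrelated: float) -> float:
    """Ratio of the Kd for an unrelated molecule to the Kd for the ligand.

    Both constants must be in the same units (e.g., molar). A larger
    ratio means the aptamer binds its ligand more selectively.
    """
    if kd_ligand <= 0 or kd_unrelated <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_unrelated / kd_ligand


def meets_threshold(kd_ligand: float, kd_unrelated: float, fold: float = 10.0) -> bool:
    """True if the ligand Kd is at least `fold`-fold lower than the unrelated Kd."""
    return fold_selectivity(kd_ligand, kd_unrelated) >= fold


# Hypothetical values: 1 uM for the ligand, 50 uM for an unrelated molecule.
print(fold_selectivity(1e-6, 5e-5))            # about 50-fold
print(meets_threshold(1e-6, 5e-5, fold=10.0))  # passes the 10-fold criterion
```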

The aptamers that can be incorporated as part of the riboswitch and screened by methods of the present invention can be naturally occurring aptamers, or modifications thereof, aptamers that are designed de novo, or synthetic aptamers screened through systematic evolution of ligands by exponential enrichment (SELEX). Examples of aptamers that bind small molecule ligands include, but are not limited to, aptamers that bind theophylline, dopamine, sulforhodamine B, cellobiose, kanamycin A, lividomycin, tobramycin, neomycin B, viomycin, chloramphenicol, streptomycin, cytokines, cell surface molecules, and metabolites. For a review of aptamers that recognize small molecules, see, e.g., Famulok, Science 9:324-9 (1999) and McKeague, M. & DeRosa, M. C. J. Nuc. Aci. 2012. In another embodiment, the aptamer is a complementary polynucleotide.

In one embodiment, the aptamer is prescreened to bind a particular small molecule ligand in vitro. Such methods for designing aptamers include, for example, SELEX. Methods for designing aptamers that selectively bind a small molecule using SELEX are disclosed in, e.g., U.S. Pat. Nos. 5,475,096, 5,270,163, and Abdullah Ozer, et al. Nuc. Aci. 2014, which are incorporated herein by reference. Modifications of the SELEX process are described in U.S. Pat. Nos. 5,580,737 and 5,567,588, which are incorporated herein by reference.

Previous selection techniques for identifying aptamers generally involve preparing a large pool of DNA or RNA molecules of the desired length that contain a region that is randomized or mutagenized. For example, an oligonucleotide pool for aptamer selection might contain a region of 20-100 randomized nucleotides flanked by regions of defined sequence that are about 15-25 nucleotides long and useful for the binding of PCR primers. The oligonucleotide pool is amplified using standard PCR techniques, or other means that allow amplification of selected nucleic acid sequences. The DNA pool may be transcribed in vitro to produce a pool of RNA transcripts when an RNA aptamer is desired. The pool of RNA or DNA oligonucleotides is then subjected to a selection based on their ability to bind specifically to the desired ligand. Selection techniques include, for example, affinity chromatography, although any protocol which will allow selection of nucleic acids based on their ability to bind specifically to another molecule may be used. Selection techniques for identifying aptamers that bind small molecules and function within a cell may involve cell based screening methods. In the case of affinity chromatography, the oligonucleotides are contacted with the target ligand that has been immobilized on a substrate in a column or on magnetic beads. The oligonucleotide is preferably selected for ligand binding in the presence of salt concentrations, temperatures, and other conditions which mimic normal physiological conditions. Oligonucleotides in the pool that bind to the ligand are retained on the column or bead, and nonbinding sequences are washed away. The oligonucleotides that bind the ligand are then amplified (after reverse transcription if RNA transcripts were utilized) by PCR (usually after elution). The selection process is repeated on the selected sequences for a total of about three to ten iterative rounds of the selection procedure. The resulting oligonucleotides are then amplified, cloned, and sequenced using standard procedures to identify the sequences of the oligonucleotides that are capable of binding the target ligand. Once an aptamer sequence has been identified, the aptamer may be further optimized by performing additional rounds of selection starting from a pool of oligonucleotides comprising a mutagenized aptamer sequence.

In one embodiment, the aptamer or aptamer library for use in the present invention comprises one or more aptamers identified in an in vitro aptamer screen. In one embodiment, the aptamers identified in the in vitro aptamer screen have one or more nucleotides randomized to create a prospective aptamer library for use in the methods of the present invention.

The Alternative Exon

The alternative exon that is part of the gene regulation polynucleotide cassette of the present invention can be any polynucleotide sequence capable of being transcribed to a pre-mRNA and alternatively spliced into the mRNA of the target gene. The alternative exon that is part of the gene regulation cassette of the present invention contains at least one sequence that inhibits translation such that when the alternative exon is included in the target gene mRNA, expression of the target gene from that mRNA is prevented or reduced. In a preferred embodiment, the alternative exon contains a stop codon (TGA, TAA, TAG) that is in frame with the target gene when the alternative exon is included in the target gene mRNA by splicing. In embodiments, the alternative exon comprises, in addition to a stop codon, or as an alternative to a stop codon, other sequence that reduces or substantially prevents translation when the alternative exon is incorporated by splicing into the target gene mRNA including, e.g., a microRNA binding site, which leads to degradation of the mRNA. In one embodiment, the alternative exon comprises a miRNA binding sequence that results in degradation of the mRNA. In one embodiment, the alternative exon encodes a polypeptide sequence which reduces the stability of the protein containing this polypeptide sequence. In one embodiment, the alternative exon encodes a polypeptide sequence which directs the protein containing this polypeptide sequence for degradation.
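Whether a stop codon in the alternative exon is in frame depends on the length of the coding sequence upstream of the exon. The check can be sketched as below; this is an illustration only, the function name and the toy sequences are hypothetical, and the upstream coding sequence is assumed to start at the first base of the start codon.

```python
STOP_CODONS = {"TGA", "TAA", "TAG"}


def first_in_frame_stop(upstream_cds: str, alt_exon: str):
    """Return the 0-based position (in the spliced sequence) of the first
    in-frame stop codon that overlaps the alternative exon, or None.

    upstream_cds: coding sequence 5' of the exon, starting at the ATG.
    alt_exon:     sequence of the alternative exon as included by splicing.
    """
    upstream = upstream_cds.upper()
    spliced = upstream + alt_exon.upper()
    # Walk the reading frame codon by codon from the start codon.
    for i in range(0, len(spliced) - 2, 3):
        if i + 3 <= len(upstream):
            continue  # codon lies entirely upstream of the exon
        if spliced[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequences (hypothetical): the exon "GATTAAGG" places TAA in frame.
print(first_in_frame_stop("ATGGCC", "GATTAAGG"))
# The exon "GATAAGG" contains TAA only out of frame, so no stop is reported.
print(first_in_frame_stop("ATGGCC", "GATAAGG"))
```

The scan also catches a stop codon that straddles the upstream exon/alternative exon junction, because the frame is counted across the spliced sequence rather than within the exon alone.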

The basal or background level of splicing of the alternative exon can be optimized by altering exon splice enhancer (ESE) sequences and exon splice suppressor (ESS) sequences and/or by introducing ESE or ESS sequences into the alternative exon. Such changes to the sequence of the alternative exon can be accomplished using methods known in the art, including, but not limited to, site directed mutagenesis. Alternatively, oligonucleotides of a desired sequence (e.g., comprising all or part of the alternative exon) can be obtained from commercial sources and cloned into the gene regulation cassette. Identification of ESS and ESE sequences can be accomplished by methods known in the art, including, for example, using ESEfinder 3.0 (Cartegni, L. et al. ESEfinder: a web resource to identify exonic splicing enhancers. Nucleic Acids Research, 2003, 31(13): 3568-3571) and/or other available resources.

In one embodiment, the alternative exon is exogenous to the target gene, although it may be derived from a sequence originating from the organism where the target gene will be expressed. In one embodiment the alternative exon is a synthetic sequence. In one embodiment, the alternative exon is a naturally-occurring exon. In another embodiment, the alternative exon is derived from all or part of a known exon. In this context, “derived” refers to the alternative exon containing sequence that is substantially homologous to a naturally occurring exon, or a portion thereof, but may contain various mutations, for example, to introduce a stop codon that will be in frame with the target reporter gene sequence, to introduce or delete an exon splice enhancer, and/or to introduce or delete an exon splice suppressor. In one embodiment, the alternative exon is derived from exon 2 of the human dihydrofolate reductase gene (DHFR), mutant human Wilms tumor 1 exon 5, mouse calcium/calmodulin-dependent protein kinase II delta exon 16, or SIRT1 exon 6.

5′ and 3′ Intronic Sequences

The alternative exon is flanked by 5′ and 3′ intronic sequences. The 5′ and 3′ intronic sequences that can be used in the gene regulation cassette can be any sequence that can be spliced out of the target gene creating either the target gene mRNA or the target gene comprising the alternative exon in the mRNA, depending upon the presence or absence of a ligand that binds the aptamer. The 5′ and 3′ introns each have the sequences necessary for splicing to occur, i.e., splice donor, splice acceptor and branch point sequences. In one embodiment, the 5′ and 3′ intronic sequences of the gene regulation cassette are derived from one or more naturally occurring introns or a portion thereof. In one embodiment, the 5′ and 3′ intronic sequences are derived from a truncated human beta-globin intron 2 (IVS2Δ). In other embodiments the 5′ and 3′ intronic sequences are derived from the SV40 mRNA intron (used in pCMV-LacZ vector from Clontech), intron 6 of human triose phosphate isomerase (TPI) gene (Nott Ajit, et al. RNA. 2003, 9:607-617), an intron from human factor IX (Sumiko Kurachi et al. J. Bio. Chem. 1995, 270(10), 5276), the target gene's own endogenous intron, or any genomic fragment or synthetic introns (Yi Lai, et al. Hum Gene Ther. 2006:17(10):1036) that contain elements that are sufficient for regulated splicing (Thomas A. Cooper, Methods 2005 (37):331).

In one embodiment, the alternative exon and riboswitch of the present invention are engineered to be in an endogenous intron of a target gene. That is, the intron (or substantially similar intronic sequence) naturally occurs at that position of the target gene. In this case, the intronic sequence immediately upstream of the alternative exon is referred to as the 5′ intron or 5′ intronic sequence, and the intronic sequence immediately downstream of the alternative exon is referred to as the 3′ intron or 3′ intronic sequence. In this case, the endogenous intron is modified to contain a splice acceptor sequence and splice donor sequence flanking the 5′ and 3′ ends of the alternative exon.

The splice donor and splice acceptor sites in the gene regulation cassette of the present invention can be modified to be strengthened or weakened. That is, the splice sites can be modified to be closer to, or further from, the consensus for a splice donor or acceptor by standard cloning methods, site directed mutagenesis, and the like. Splice sites that are more similar to the splice consensus tend to promote splicing and are thus strengthened. Splice sites that are less similar to the splice consensus tend to hinder splicing and are thus weakened. The consensus for the splice donor of the most common class of introns (U2) is A/C A G∥G T A/G A G T (where ∥ denotes the exon/intron boundary). The consensus for the splice acceptor is C A G∥G (where ∥ denotes the exon/intron boundary). The frequencies of particular nucleotides at the splice donor and acceptor sites are described in the art (see, e.g., Zhang, M. Q., Hum Mol Genet. 1998. 7(5):919-932). The strength of 5′ and 3′ splice sites can be adjusted to modulate splicing of the alternative exon.
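As a rough illustration of "closer to the consensus", a candidate splice donor can be compared position by position with the U2 donor consensus given above. This is a simplified sketch: counting matches is only a crude proxy for splice site strength, published tools use position-specific nucleotide frequencies instead, and the function name and test sites are hypothetical.

```python
# U2-type splice donor consensus: A/C A G || G T A/G A G T
# (three exonic positions followed by six intronic positions).
DONOR_CONSENSUS = ["AC", "A", "G", "G", "T", "AG", "A", "G", "T"]


def donor_consensus_matches(site: str) -> int:
    """Count positions of a 9-nt donor site that agree with the consensus."""
    site = site.upper()
    if len(site) != len(DONOR_CONSENSUS):
        raise ValueError("expected a 9-nucleotide site (3 exonic + 6 intronic)")
    return sum(base in allowed for base, allowed in zip(site, DONOR_CONSENSUS))


print(donor_consensus_matches("CAGGTAAGT"))  # full consensus match
print(donor_consensus_matches("TTGGTCCGT"))  # weaker, fewer matching positions
```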

Additional modifications to 5′ and 3′ introns in the gene regulation cassette can be made to modulate splicing including modifying, deleting, and/or adding intronic splicing enhancer elements and/or intronic splicing suppressor elements, and/or modifying the branch site sequence.

In one embodiment, the 5′ intron has been modified to contain a stop codon that will be in frame with the reporter gene. The 5′ and 3′ intronic sequences can also be modified to remove cryptic splice sites, which can be identified with publicly available software (see, e.g., Kapustin, Y. et al. Nucl. Acids Res. 2011. 1-8). The lengths of the 5′ and 3′ intronic sequences can be adjusted in order to, for example, meet the size requirements for viral expression constructs. In one embodiment, the 5′ and 3′ intronic sequences are independently from about 50 to about 300 nucleotides in length. In one embodiment, the 5′ and 3′ intronic sequences are independently from about 125 to about 240 nucleotides in length.

Reporter Genes

The screening methods of the present invention utilize a gene regulation cassette that is used to regulate the expression of a target gene (e.g., a reporter gene) that can be expressed in a target cell, tissue or organism. The reporter gene can be any gene whose expression can be used to detect the specific interaction of a ligand with the aptamer in the gene regulation cassette. In one embodiment, the reporter gene encodes a fluorescent protein, including, e.g., a green fluorescent protein (GFP), a cyan fluorescent protein, a yellow fluorescent protein, an orange fluorescent protein, a red fluorescent protein, or a switchable fluorescent protein. In another embodiment, the reporter gene encodes a luciferase enzyme including, e.g., firefly luciferase, Renilla luciferase, or secretory Gaussia luciferase. In one embodiment, the reporter gene is β-galactosidase. In one embodiment, the reporter is horseradish peroxidase (HRP). In one embodiment, the reporter gene is selected from the group consisting of a nuclear protein, transporter, cell membrane protein, cytoskeleton protein, receptor, growth hormone, cytokine, signaling molecule, regulatory RNA, antibody, and therapeutic proteins or peptides.

Expression Constructs

The present invention contemplates the use of a recombinant vector for introducing into target cells a polynucleotide encoding a reporter gene and containing the gene regulation cassette of the present invention. In many embodiments, the recombinant DNA construct of this invention includes additional DNA elements including DNA segments that provide for the replication of the DNA in a host cell and expression of the target gene in that cell at appropriate levels. The ordinarily skilled artisan appreciates that expression control sequences (promoters, enhancers, and the like) are selected based on their ability to promote expression of the reporter gene in the target cell. “Vector” means a recombinant plasmid, yeast artificial chromosome (YAC), mini chromosome, DNA mini-circle or virus (including virus derived sequences) that comprises a polynucleotide to be delivered into a host cell, either in vitro or in vivo. In one embodiment, the recombinant vector is a viral vector or a combination of multiple viral vectors. Viral vectors for the aptamer-mediated expression of a reporter gene in a target cell are known in the art and include adenoviral (AV) vectors, adeno-associated virus (AAV) vectors, retroviral and lentiviral vectors, and Herpes simplex type 1 (HSV1) vectors.

Methods for Dividing Aptamer Libraries into Sub-Libraries

Another aspect of the present invention provides methods to divide large oligonucleotide libraries into smaller sub-libraries and approaches to make cellular assay-screenable plasmid libraries of aptamer-based synthetic riboswitches. One aspect of the invention provides a method for splitting an oligonucleotide library into smaller sub-libraries comprising the steps:

    (a) providing an oligonucleotide library wherein the oligonucleotides in the library comprise multiple 5′ and 3′ constant regions,
    (b) performing a two-cycle PCR using the oligonucleotide library as the template and first primers and second primers that are complementary to the 5′ and 3′ constant regions,
    (c) isolating the products of the two-cycle PCR, and
    (d) PCR amplifying a subset of the isolated products of the two-cycle PCR using primers complementary to a subset of the unique 5′ and 3′ constant regions.
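The bookkeeping behind steps (a) to (d) can be pictured in silico: if each library member carries one of several 5′ constant regions and one of several 3′ constant regions, then amplifying with a given primer pair recovers only the members carrying that pair. The sketch below models only this grouping and not the PCR chemistry; the tag names (`L1`, `R1`, ...) and the sequences are hypothetical.

```python
from collections import defaultdict


def split_by_constant_regions(library):
    """Group library members by their (5' constant, 3' constant) tag pair.

    library: iterable of (five_prime_tag, variable_region, three_prime_tag).
    Returns a dict mapping each tag pair to its sub-library of sequences.
    """
    sub_libraries = defaultdict(list)
    for five_tag, variable_region, three_tag in library:
        sub_libraries[(five_tag, three_tag)].append(variable_region)
    return dict(sub_libraries)


# Hypothetical library with two 5' tags and two 3' tags.
library = [
    ("L1", "ACGTAC", "R1"),
    ("L1", "TTGACC", "R2"),
    ("L2", "GGCATA", "R1"),
    ("L1", "CATGCA", "R1"),
]
subs = split_by_constant_regions(library)
# "Amplifying" with the L1/R1 primer pair recovers only that sub-library.
print(subs[("L1", "R1")])
```

With m distinct 5′ constant regions and n distinct 3′ constant regions, this scheme yields up to m × n sub-libraries.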

In one embodiment, the oligonucleotide library is a randomized aptamer library containing one or more randomized nucleotides. The aptamer sequences are flanked by left and right constant regions, each of which contains a restriction site for subsequent cloning.

In one embodiment, the first or second primer in the two-cycle PCR comprises a label selected from the group consisting of biotin, digoxigenin (DIG), bromodeoxyuridine (BrdU), a fluorophore, a chemical group such as a thiol group, or a chemical group used in click chemistry such as an azide. These molecules can be linked to the oligonucleotides, and their interacting molecules, such as streptavidin or modified forms of avidin for biotin, antibodies against DIG, BrdU, or the fluorophore, a second thiol group to form a disulfide, or an alkyne group for azides, can be immobilized on a solid surface to facilitate the isolation of labeled oligonucleotides.

Once an aptamer library is divided into sub-libraries of aptamers, the aptamers in one or more sub-libraries are introduced into the gene regulation polynucleotide cassette to generate a riboswitch library and screened for ligand binding by the methods provided herein.

Methods for Dividing Riboswitch Libraries into Sub-Libraries

In one aspect the present invention provides a method for dividing a library of riboswitches into sub-libraries. A library of riboswitches as used herein is a plasmid library comprising a gene regulation polynucleotide cassette, e.g., as described herein and in PCT/US2016/016234, comprising a plurality of aptamers where each individual member of the library comprises an aptamer sequence that is different from that of other members of the library. In embodiments, the aptamers in the library of riboswitches comprise one or more randomized nucleotides. In embodiments, the plasmid riboswitch library was generated from an aptamer sub-library created by the methods described herein.

The method for dividing the riboswitch library into sub-libraries comprises the steps of:

    (a) introducing a library of aptamers into a plasmid comprising a gene regulation polynucleotide cassette described herein to make a riboswitch library;
    (b) introducing the riboswitch library into bacteria (e.g., E. coli);
    (c) collecting bacterial clones (for example by picking bacterial colonies) and extracting plasmid DNA to obtain plasmid sub-libraries of riboswitches (referred to herein as primary sub-libraries);
    (d) optionally, generating secondary sub-libraries of riboswitches from a primary plasmid sub-library of riboswitches by introducing a primary sub-library into bacteria, collecting bacterial clones and isolating the plasmid DNA.

Methods for introducing sequences into plasmids to generate a library are known in the art, as are methods for introducing plasmids into bacteria and obtaining bacterial clones. Bacterial clones containing a member of the plasmid riboswitch library may be collected by plating bacteria and picking individual colonies. Pooled plasmids from these clones form the sub-library. The number of bacterial clones collected determines the size (number of unique members) of the sub-library of riboswitches, and multiple sub-libraries may be generated. One or more primary sub-libraries can be further divided to create secondary sub-libraries to further reduce the size of the sub-libraries. The sub-libraries are screened using the methods described herein by introducing one or more sub-libraries into eukaryotic cells, exposing the cells to a ligand of interest, and measuring the expression of the reporter gene from the gene regulation polynucleotide cassette. An increase in reporter gene expression in response to ligand indicates that one or more members of the library comprise an aptamer that binds to the ligand in the context of the riboswitch. Thus, the size of the sub-library that can be screened may be determined by the sensitivity of the assay for measuring reporter gene expression. In embodiments of the invention, a sub-library comprises about 50 to about 600 unique members (although some members may be repeated in other sub-libraries).
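The relationship between library size, colonies picked, and sub-library size can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes every library member is equally represented among the transformants (rarely true in practice), and the function names and the example numbers are hypothetical.

```python
import math


def n_sublibraries(library_size: int, members_per_sublibrary: int) -> int:
    """Minimum number of sub-libraries needed to hold every library member once."""
    return math.ceil(library_size / members_per_sublibrary)


def expected_unique_members(library_size: int, colonies_picked: int) -> float:
    """Expected number of distinct library members among the picked colonies,
    assuming each colony is an independent, uniform draw from the library."""
    return library_size * (1 - (1 - 1 / library_size) ** colonies_picked)


# Example: 4**10 = 1,048,576 variants (ten randomized positions), 600 per sub-library.
print(n_sublibraries(4 ** 10, 600))
# Picking 600 colonies from a small primary sub-library of 600 members
# re-samples some members, so fewer than 600 distinct members are expected.
print(round(expected_unique_members(600, 600)))
```

The second estimate is one reason some members can appear in more than one sub-library while others are missed when colonies are picked at random.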

It is to be understood and expected that variation in the principles of the invention herein disclosed can be made by one of ordinary skill in the art and it is intended that such modifications are to be included within the scope of the present invention. All references cited herein are hereby incorporated by reference in their entirety. The following examples further illustrate the invention, but should not be construed to limit the scope of the invention.

Example 1

Mammalian cell-based screening for aptamer/ligands using splicing-based gene regulating riboswitches.

Procedures:

Construction of riboswitches: Riboswitches were constructed as described in PCT/US2016/016234 (in particular Examples 3 to 6), incorporated herein by reference. A truncated human beta-globin intron sequence was synthesized and inserted in the coding sequence of a firefly luciferase gene. A mutant human DHFR exon 2 was synthesized and inserted in the middle of this truncated beta-globin intron sequence using a Golden Gate cloning strategy. Aptamers including xpt-G/A¹, ydhl-G/A², yxj³, add⁴, and gdg6-G/A⁵ (citations for the aptamers are incorporated herein by reference) were synthesized as oligonucleotides (“oligos”) (IDT) with 4-nucleotide overhangs at the 5′ ends that are individually complementary to two different BsaI sites, annealed, and ligated to BsaI-digested mDHFR-Luci-acceptor vector.

Transfection: 3.5×10⁴ HEK 293 cells were plated in a 96-well flat bottom plate the day before transfection. Plasmid DNA (500 ng) was added to a tube or a 96-well U-bottom plate. Separately, TransIT-293 reagent (Mirus; 1.4 μL) was added to 50 μL Opti-MEM I media (Life Technologies), and allowed to sit for 5 minutes at room temperature (RT). Then, 50 μL of this diluted transfection reagent was added to the DNA, mixed, and incubated at RT for 20 min. Finally, 7 μL of this solution was added to a well of cells in the 96-well plate.

Firefly luciferase assay of cultured cells: 24 hours after media change, plates were removed from the incubator, equilibrated to RT for several minutes on a lab bench, and then aspirated. Glo Lysis Buffer (Promega, 100 μL, RT) was added, and the plates were maintained at RT for at least 5 minutes. Then, the well contents were mixed by 50 μL trituration, and 20 μL of each sample was mixed with 20 μL of Bright-Glo reagent (Promega) that had been diluted to 10% in Glo Lysis Buffer. The 96 wells were spaced on an opaque white 384-well plate. Following a 5 min incubation at RT, luminescence was measured using a Tecan machine with a 500 ms read time. The luciferase activity was expressed as mean relative light units (RLU)±S.D., and fold induction was calculated as the luciferase activity obtained with guanine treatment divided by the luciferase activity obtained without guanine treatment.
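The readout described above (mean RLU ± S.D. and fold induction) amounts to the following calculation. This is a minimal sketch; the replicate RLU values are made up for illustration and the function names are hypothetical.

```python
from statistics import mean, stdev


def summarize_rlu(replicates):
    """Return (mean, sample standard deviation) of replicate RLU readings."""
    return mean(replicates), stdev(replicates)


def fold_induction(treated, untreated):
    """Mean RLU with ligand divided by mean RLU without ligand."""
    return mean(treated) / mean(untreated)


# Made-up triplicate readings for one riboswitch construct.
with_guanine = [2000, 2200, 1800]
without_guanine = [10, 12, 8]

print(summarize_rlu(with_guanine))
print(summarize_rlu(without_guanine))
print(fold_induction(with_guanine, without_guanine))
```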

Results:

Starting with luciferase as a reporter gene, a gene expression platform was created by inserting a human β-globin intron in the middle of the coding sequence of firefly luciferase and a mutant stop codon-containing human DHFR exon 2 (mDHFR) in the intron portion. The reporter gene expression is thus controlled by the inclusion or exclusion of the mDHFR exon containing a stop codon that is in frame with the reporter gene. In this system, a hairpin structure in the mRNA formed by the U1 binding site and an inserted complementary sequence blocks the inclusion of the mDHFR exon, thereby enabling target gene expression (FIG. 1a). To make the formation of the hairpin structure regulatable, and thus target gene expression controllable by small molecules, we grafted synthetic aptamers (theophylline), natural aptamers (xpt-G/A, yxj, ydhl-A/G, and add-A/G aptamers), or a hybrid aptamer (gdg6-G/A) to this splicing-based gene regulation platform between the U1 binding site and its complementary sequence, and generated synthetic riboswitches that regulate gene expression in mammalian cells. By using our splicing-based gene regulation cassette and inserting different aptamers into our synthetic riboswitch construct, we demonstrated different functional responses to ligand in the context of mammalian cells. Those riboswitches with guanine aptamers respond to guanine as well as guanosine, as shown in FIG. 1b. The xpt-guanine riboswitch, xpt-G17 (disclosed in PCT/US2016/016234, see, e.g., SEQ ID NO.:15, incorporated herein by reference), yielded a high dynamic range of induction of reporter gene expression in response to treatment with its natural ligand.

Although the natural aptamer-based riboswitches have a high dynamic range in regulating gene expression in mammalian cells, the nature of the ligands for those natural aptamers limits their applicability in vivo. Taking advantage of our highly dynamic gene regulation platform with riboswitches, we first chose a list of guanine analogs that have different chemical groups at the N2 position to test their activities on the xpt-G17 riboswitch. As shown in FIG. 1c, at 500 μM concentration, several N2 compounds induced luciferase activity in cells with the xpt-G17 construct, with N2-phenoxyacetyl guanine being the most potent (1303-fold induction), as shown in FIG. 1d. To expand the list of compounds for use as potential ligands, the Prestwick library (a collection of 1280 clinically approved drugs) was used at 100 μM to screen for optimal ligands to activate known aptamers in the context of mammalian cells. As shown in FIG. 1e from a preliminary screen, the guanine riboswitch xpt-G17 responded not only to guanine, but also to 8-azaguanine, nadide, N6-methyladenosine, testosterone propionate, adenosine 5′-monophosphate monohydrate, amphotericin B, thioguanine, tyloxapol, progesterone and chlormadinone acetate, as well as a number of other compounds as listed in Table 1. Intriguingly, some of the compounds that showed activity on the xpt-G17 riboswitch are structurally very different from guanine or guanosine. The Prestwick library was further screened with 8 other purine riboswitches, and a number of compounds that can activate the riboswitches to induce luciferase activity were obtained (Table 1). These results demonstrate the utility of the riboswitch system in discovering potential optimal ligands for a known aptamer in a cellular environment, further highlighting the importance of generating aptamers in the context of the cells within which the riboswitch will be required to function.

TABLE 1

Riboswitch | Compound name | Fold Induction
xpt-G17 | 8-Azaguanine | 131.0
xpt-G17 | Azathioprine | 6.2
xpt-G17 | Cinnarizine | 3.5
xpt-G17 | Pimethixene maleate | 4.7
xpt-G17 | N6-methyladenosine | 30.7
xpt-G17 | thioguanosine | 21.0
xpt-G17 | Adenosine 5'-monophosphate monohydrate | 28.4
xpt-G17 | Amphotericin B | 21.5
xpt-G17 | Testosterone propionate | 29.0
xpt-G17 | Haloprogin | 5.1
xpt-G17 | Idebenone | 3.3
xpt-G17 | Zotepine | 4.3
xpt-G17 | Progesterone | 12.0
xpt-G17 | Tenatoprazole | 3.2
xpt-G17 | Acetopromazine maleate salt | 4.5
xpt-G17 | Etofenamate | 7.5
xpt-G17 | Mercaptopurine | 3.6
xpt-G17 | Avermectin B1 | 4.0
xpt-G17 | Promazine hydrochloride | 3.7
xpt-G17 | Nadide | 40.9
xpt-G17 | Trimeprazine tartrate | 4.9
xpt-G17 | Promethazine hydrochloride | 5.3
xpt-G17 | Tyloxapol | 16.2
xpt-G17 | Chlormadinone acetate | 10.3
xpt-G17 | Pyrvinium pamoate | 5.1
gdg6-A8 | 8-Azaguanine | 305.9
gdg6-A8 | Cimetidine | 3.0
gdg6-A8 | Azathioprine | 19.9
gdg6-A8 | Diperodon hydrochloride | 3.0
gdg6-A8 | Pimethixene maleate | 9.2
gdg6-A8 | thioguanosine | 20.1
gdg6-A8 | Acetopromazine maleate salt | 8.6
gdg6-A8 | Mercaptopurine | 17.2
gdg6-A8 | Opipramol dihydrochloride | 3.1
gdg6-A8 | Promazine hydrochloride | 12.6
gdg6-A8 | Methotrimeprazine maleate salt | 5.0
gdg6-A8 | Dienestrol | 4.3
gdg6-A8 | Trimipramine maleate salt | 5.3
gdg6-A8 | Trimeprazine tartrate | 8.7
gdg6-A8 | Promethazine hydrochloride | 4.8
gdg6-A8 | Vorinostat | 6.4
gdg6-A8 | Methiazole | 3.8
yxj-A6 | 8-Azaguanine | 55.6
yxj-A6 | Azathioprine | 6.6
yxj-A6 | Pimethixene maleate | 4.9
yxj-A6 | thioguanosine | 10.2
yxj-A6 | Acetopromazine maleate salt | 3.1
yxj-A6 | Mercaptopurine | 10.0
yxj-A6 | Promazine hydrochloride | 6.6
yxj-A6 | Sulfaquinoxaline sodium | 3.4
yxj-A6 | Trimipramine maleate salt | 3.3
yxj-A6 | Trimeprazine tartrate | 7.0
yxj-A6 | Promethazine hydrochloride | 7.9
yxj-A6 | Pirlindole mesylate | 3.2
add-A6 | 8-Azaguanine | 22.1
add-A6 | Azathioprine | 6.5
add-A6 | Pimethixene maleate | 4.2
add-A6 | thioguanosine | 5.9
add-A6 | Acetopromazine maleate salt | 3.5
add-A6 | Mercaptopurine | 15.0
add-A6 | Opipramol dihydrochloride | 4.3
add-A6 | Promazine hydrochloride | 10.5
add-A6 | Sulfaquinoxaline sodium | 3.3
add-A6 | Terazosin hydrochloride | 3.5
add-A6 | Trimipramine maleate salt | 3.4
add-A6 | Trimeprazine tartrate | 4.0
add-A6 | Promethazine hydrochloride | 4.5
add-A6 | Deptropine citrate | 3.3
add-A6 | Alcuronium chloride | 4.2
ydhl-A6 | Hydrochlorothiazide | 3.3
ydhl-A6 | 8-Azaguanine | 3.7
ydhl-A6 | Ticlopidine hydrochloride | 3.1
ydhl-A6 | Alverine citrate salt | 4.2
ydhl-A6 | Vincamine | 3.3
ydhl-A6 | Idebenone | 3.5
ydhl-A6 | Pepstatin A | 4.0
ydhl-A6 | Modafinil | 3.8
ydhl-A6 | Benperidol | 3.1
ydhl-A6 | Digoxigenin | 4.5
ydhl-A6 | Digoxigenin | 3.3
ydhl-A6 | Moricizine hydrochloride | 10.3
ydhl-A6 | Pivmecillinam hydrochloride | 3.2
ydhl-A6 | Piperidolate hydrochloride | 3.4
ydhl-A6 | Oxaprozin | 3.4
ydhl-A6 | Imidurea | 4.3
ydhl-A6 | Mecamylamine hydrochloride | 3.2
xpt-A8 | 8-Azaguanine | 95.1
xpt-A8 | Azathioprine | 5.9
xpt-A8 | Pimethixene maleate | 3.3
xpt-A8 | thioguanosine | 11.8
xpt-A8 | Mercaptopurine | 3.4
xpt-A8 | Promazine hydrochloride | 4.1
xpt-A8 | Promethazine hydrochloride | 5.4
gdg6-G8 | 8-Azaguanine | 42.3
gdg6-G8 | Azathioprine | 16.2
gdg6-G8 | Pimethixene maleate | 5.1
gdg6-G8 | thioguanosine | 15.9
gdg6-G8 | Amphotericin B | 3.8
gdg6-G8 | Acetopromazine maleate salt | 3.2
gdg6-G8 | Mercaptopurine | 16.2
gdg6-G8 | Promazine hydrochloride | 6.2
gdg6-G8 | Trimipramine maleate salt | 3.3
gdg6-G8 | Trimeprazine tartrate | 6.2
gdg6-G8 | Promethazine hydrochloride | 6.5
gdg6-G8 | Penbutolol sulfate | 3.3
gdg6-G8 | Vorinostat | 10.2
gdg6-G8 | Methiazole | 3.3
gdg6-G8 | Estriol | 4.3
add-G6 | 8-Azaguanine | 47.9
add-G6 | Niclosamide | 3.0
add-G6 | Azathioprine | 11.4
add-G6 | Lynestrenol | 3.8
add-G6 | R(−)Apomorphine hydrochloride hemihydrate | 3.4
add-G6 | Danazol | 3.7
add-G6 | Camptothecine (S, +) | 5.7
add-G6 | Cinnarizine | 3.6
add-G6 | Pimethixene maleate | 6.6
add-G6 | Flunarizine dihydrochloride | 4.7
add-G6 | N6-methyladenosine | 20.8
add-G6 | thioguanosine | 7.9
add-G6 | Adenosine 5'-monophosphate monohydrate | 9.4
add-G6 | Bepridil hydrochloride | 4.4
add-G6 | Amphotericin B | 10.7
add-G6 | Testosterone propionate | 8.8
add-G6 | Haloprogin | 5.9
add-G6 | Idebenone | 6.7
add-G6 | Meclocycline sulfosalicylate | 3.4
add-G6 | Progesterone | 6.0
add-G6 | Acetopromazine maleate salt | 5.0
add-G6 | Etofenamate | 5.1
add-G6 | Mercaptopurine | 14.3
add-G6 | Benzamil hydrochloride | 3.0
add-G6 | Avermectin B1 | 11.8
add-G6 | Promazine hydrochloride | 5.4
add-G6 | Nadide | 30.8
add-G6 | Trimipramine maleate salt | 3.4
add-G6 | Trimeprazine tartrate | 6.2
add-G6 | Simvastatin | 6.2
add-G6 | Promethazine hydrochloride | 6.7
add-G6 | Protriptyline hydrochloride | 5.0
add-G6 | Chlormadinone acetate | 26.1
add-G6 | Nomegestrol acetate | 3.5
add-G6 | Pyrvinium pamoate | 15.8
add-G6 | Sertaconazole Nitrate | 6.5
add-G6 | Vorinostat | 3.6
ydhl-G8 | Sulfaguanidine | 13.9
ydhl-G8 | 8-Azaguanine | 35.6
ydhl-G8 | N6-methyladenosine | 10.7
ydhl-G8 | thioguanosine | 7.5
ydhl-G8 | Adenosine 5'-monophosphate monohydrate | 5.9
ydhl-G8 | Amphotericin B | 6.5
ydhl-G8 | Tetracaïne hydrochloride | 3.6
ydhl-G8 | Acetopromazine maleate salt | 3.9
ydhl-G8 | Azelastine hydrochloride | 3.0
ydhl-G8 | Etofenamate | 4.8
ydhl-G8 | Mercaptopurine | 3.6
ydhl-G8 | Promazine hydrochloride | 5.2
ydhl-G8 | Nadide | 11.7
ydhl-G8 | Trimeprazine tartrate | 5.0
ydhl-G8 | Chlormadinone acetate | 10.4
ydhl-G8 | Pyrvinium pamoate | 5.5
ydhl-G8 | Vorinostat | 3.0

Sequences for riboswitches used in the Prestwick library screen are provided below, with the stem sequences in capital letters and the aptamer sequences in lowercase letters:

xpt-A8 (SEQ ID NO: 1): GTAATGTataatcgcgtggatatggcacgcaagtttctaccgggcaccgtaaatgtccgattACATTAC
add-G6 (SEQ ID NO: 2): GTAATGTGtataatcctaatgatatggtttgggagtttctaccaagagccttaaactcttgactaCACATTAC
add-A6 (SEQ ID NO: 3): GTAATGTGtataatcctaatgatatggtttgggagtttctaccaagagccttaaactcttgattaCACATTAC
gdg6-A8 (SEQ ID NO: 4): GTAATGTacagggtagcataatgggctactgaccccgccgggaaacctatttcccgattACATTAC
gdg6-G8 (SEQ ID NO: 5): GTAATGTacagggtagcataatgggctactgaccccgccgggaaacctatttcccgactACATTAC
Ydh1-G8 (SEQ ID NO: 6): GTAATGTataacctcaataatatggtttgagggtgtctaccaggaaccgtaaaatcctgactACATTAC
Ydh1-A6 (SEQ ID NO: 7): GTAATGTGtataacctcaataatatggtttgagggtgtctaccaggaaccgtaaaatcctgattaCACATTAC
yxj-A6 (SEQ ID NO: 8): GTAATGTGtatatgatcagtaatatggtctgattgtttctacctagtaaccgtaaaaaactagattaCACATTAC

Example 2

Design and synthesis of aptamer library.

Procedure

To generate an aptamer library, nucleotides at positions in the aptamer that were identified from the crystal structure^(6, 7) as potentially involved in ligand binding were randomized. To facilitate cloning of the aptamers into riboswitches, the aptamer region was flanked by constant regions containing type IIS restriction enzyme (e.g., BsaI) cut sites. The following 153 bp Ultramer oligonucleotide, containing the aptamer sequence with randomized bases, was synthesized by IDT (N represents a random nucleotide) (SEQ ID NO: 9):

GACTTCGGTCTCATCCAGAGAATGAAAAAAAAATCTTCAGTAGAAGGTAATGTATANNNGCGTGGATATGGCACGCNNGNNNNCNCCGGGCACCGTAAATGTCCGACTACATTACGCACCATTCTAAAGAATAACAGTGAAGAGACCAGACGG

To generate more sequence diversity in the aptamer library, bases at more positions can be randomized. A completely random sequence can also be used to generate the aptamer library.

Results

As described in Example 1, we have successfully built synthetic riboswitches that regulate mammalian gene expression in response to small molecule ligand treatment. One of these riboswitches, xpt-G17, contains the xpt-G guanine aptamer in the splicing-based gene expression cassette. Using luciferase as a reporter gene, we achieved a high dynamic range of gene regulation in response to guanine treatment, with 2000-fold induction at a high concentration of guanine. This unprecedented dynamic range of gene regulation by the aptamer/ligand-mediated alternative splicing constructs provides a system to screen for aptamers against a desired ligand in mammalian cells, or to screen for ligands that bind and activate known aptamers.

The xpt-G17 riboswitch was selected as a platform to build a starting riboswitch library. The oligonucleotide sequence was designed to replace the original xpt-G guanine aptamer in the subsequent cloning steps. The nucleotides in the xpt-G guanine aptamer at positions known to be critical for guanine binding, based on crystallographic analysis, were randomized. Initially, 10 positions were randomized, which generated a library of 1,048,576 (4¹⁰) aptamer sequences. When more than 10 positions are randomized, libraries larger than 10⁶ sequences can be generated. Though the xpt-G guanine aptamer backbone sequence was selected for randomization here, a similar approach can be used to generate aptamer libraries from other known aptamers, or even from completely random sequences without known ligands. Though we chose xpt-G17 as the platform here, it is important to note that riboswitches with different aptamers, or riboswitches based on mechanisms other than splicing, can also be used as a starting platform to generate randomized aptamer sequences.
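The relationship between the number of randomized positions and the library size can be checked with a short calculation. The following is a minimal sketch, assuming each N position is drawn independently from the four bases; the function name `library_size` is ours and is not part of the described method.

```python
# Degenerate oligo of SEQ ID NO: 9 as given in Example 2 (N = randomized base).
SEQ_ID_NO_9 = (
    "GACTTCGGTCTCATCCAGAGAATGAAAAAAAAATCTTCAGTAGAAGGTAATGTATANNNGCGTGGATATGG"
    "CACGCNNGNNNNCNCCGGGCACCGTAAATGTCCGACTACATTACGCACCATTCTAAAGAATAACAGTGAAG"
    "AGACCAGACGG"
)


def library_size(oligo: str, alphabet_size: int = 4) -> int:
    """Number of distinct sequences encoded by a degenerate oligo.

    Assumes every 'N' is an independent choice among `alphabet_size` bases.
    """
    n_random = oligo.upper().count("N")
    return alphabet_size ** n_random


n_positions = SEQ_ID_NO_9.count("N")
size = library_size(SEQ_ID_NO_9)
print(f"{n_positions} randomized positions -> {size:,} sequences")
```

Under the same assumption, an oligo with 5 randomized positions encodes 4⁵ = 1024 sequences.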

Example 3

Splitting a large randomized aptamer library into smaller aptamer sub-libraries.

Procedures

Oligonucleotides (oligos): The JF and JR sets of primers have a 3′ portion complementary to the constant regions in the synthesized aptamer oligos and a 5′ portion containing random 20-mer oligo sequences. The F and R sets of primers are complementary to the random 20-mer oligo sequences in the JF and JR primers. All primers were synthesized by IDT. The JF primers were labeled with biotin at the 5′ end (IDT). Synthesized oligos were suspended in DNase- and RNase-free water to 100 μM as a stock solution, diluted to the desired concentration, and quantified using a NanoDrop instrument or the OliGreen method (ThermoFisher).

Two-cycle PCR amplification: To add the biotinylated oligo tag, two-cycle PCR amplification was performed using the Pfx Platinum PCR kit following the manufacturer's protocol in a reaction volume of 10 μl. The oligo templates were used at the desired copy numbers in the PCR reaction (1 to 5 copies per oligo sequence in the aptamer library). For the first cycle of amplification, only the reverse primers (JR) were included. The reaction was run at 94° C. for 2 minutes, then 94° C. for 10 seconds, followed by annealing with a touch-down program from 66° C. to 52° C., descending at 0.5° C. per minute. The polymerase extension was then carried out at 68° C. for 20 seconds, followed by cooling to 4° C. Then 10 μl of PCR mixture without template but containing biotinylated forward primers (biotin-JF) was added to the first-cycle PCR tube for the second cycle of amplification using the same PCR steps. The PCR products were then ready for incubation with streptavidin beads.

Isolation of biotinylated oligonucleotides (oligos): 2× Binding and Washing buffer (BW buffer) was made of 1× TE buffer (Ambion) with 2 M NaCl. Dynabeads M-270 Streptavidin (ThermoFisher) (SA-beads) were blocked with 20 μM yeast tRNA solution (Ambion) for 10 minutes at room temperature, washed twice with 1× BW buffer, and re-suspended in 2× BW buffer at the same volume as the initial volume of beads used. 50 μl of these treated beads were added to the PCR products together with 100 μl of 2× BW buffer and 30 μl of water. The 200 μl mixture of biotinylated oligos and SA-beads was incubated at room temperature for 60 minutes; the beads were then denatured at 95° C. for 5 minutes, chilled immediately on ice, and washed once with 1× BW buffer and twice with water for 5 minutes, following the manufacturer's protocol. As much washing solution as possible was removed, and the washed beads were ready for the PCR reaction.

Oligo sequence tag-specific PCRs: Beads with biotinylated PCR products were added to a total of 50 μl of PCR mix using the Pfx Platinum PCR kit. The primers were a mixture of the F and R sets of primers. The reaction was preheated at 94° C. for 2 minutes and subjected to 28 cycles of 94° C. for 15 seconds, 62° C. for 30 seconds, and 68° C. for 20 seconds, followed by an additional extension at 68° C. for 2 minutes. The PCR product was cooled to 12° C. and was ready for the second round of PCR. For the second round of PCR amplification, 1 μl of the PCR product from the first round was used as the template, and a single pair of F and R primers was used to amplify templates tagged with the complementary sequences. The PCR reaction was preheated at 94° C. and amplified with 25 cycles of 94° C. for 15 seconds, 60° C. for 30 seconds, and 68° C. for 20 seconds, followed by an additional extension at 68° C. for 2 minutes.

Results

Although in vitro selection using systematic evolution of ligands by exponential enrichment (SELEX)^(8, 9) has been extensively applied to screening large aptamer libraries, usually with 10¹³ to 10¹⁴ sequences, to generate numerous aptamers against a wide range of ligands including metabolites, vitamin cofactors, metal ions, proteins and even whole cells¹⁰, methods for cell-based screens of such large randomized aptamer libraries have not been developed. Moreover, few aptamers generated by SELEX have proved effective in a cellular environment, highlighting the importance of screening aptamers in the cellular environment where they will be required to function. For selected aptamers to work within cells, the binding of the specific aptamer to its ligand must have a functional consequence, which cannot be tested via SELEX because SELEX selects aptamers based only on ligand binding under in vitro conditions. One challenge in developing mammalian cell-based screens for aptamers is the low dynamic gene regulatory range of aptamer-based riboswitches in response to ligand treatment. In addition to this fundamental limitation, the intrinsically low gene transduction efficiency in mammalian cells imposes another barrier to screening libraries larger than 10⁵ sequences. However, we developed synthetic riboswitches that can generate up to several thousand-fold induction of gene expression upon ligand treatment. This high dynamic range of gene regulation provides the basis of a cell-based system for screening aptamers/ligands. To select aptamers in eukaryotic cells from large aptamer libraries with high sequence diversity, the present invention provides multiple strategies and approaches to split large aptamer libraries into smaller sub-libraries that can be cloned into a riboswitch cassette to generate plasmid libraries that are screenable through mammalian cell-based assays.

The strategy for splitting large aptamer libraries is to first add a pair of unique sequences at the 5′ and 3′ ends of the synthesized, randomized aptamer oligo sequences (as described in Example 2). In the second step of this strategy, aptamer sequences tagged with unique oligo sequences are amplified using a single pair of primers complementary to each pair of sequence tags, thus generating different sub-libraries of aptamers (FIG. 3a). This two-step process of tagging and PCR can be iterated to split the library to the desired sizes.

To attach unique sequence pairs to the template, we have developed multiple approaches (FIG. 3b). One approach is to use PCR to incorporate unique sequences into templates (PCR approach). Other approaches (ligation approaches) include ligating single-strand sequence tags to single-strand templates using T4 RNA ligase, and ligating double-strand sequence tags, using T4 DNA ligase, to double-strand templates that are generated by PCR amplification of single-strand oligonucleotide templates. We have developed and tested a two-cycle PCR approach (FIG. 3c), and are currently testing the ligation approaches for adding unique sequence tags.

To use the PCR approach to attach sequence tags and generate a tagged library of aptamers, one set of PCR primers (JF and JR) was designed. These primers contain the tag sequence in the 5′ portion and, in the 3′ portion, the sequence complementary to the constant region in the synthesized aptamer oligos. To avoid the heterogeneity generated by multi-cycle conventional PCR¹¹ using high copy numbers of templates, a two-cycle PCR was developed to attach a sequence tag at one end of the template in each cycle (FIG. 3c). In this two-cycle PCR, the copy number of the randomized oligo templates was kept to a minimum to decrease the chance of each template being attached to more than one pair of tag sequences. To isolate and purify the tagged templates, we labeled the JF primers with biotin, so that magnetic streptavidin beads can be used to separate biotinylated tagged templates from the rest of the reaction components (FIG. 3d). Because the PCR tagging started with a low copy number of templates, the isolated biotinylated, tagged templates were amplified and expanded by PCR using a mixture of the primers (F and R primers) that are specific to the tag sequences attached to the templates, generating a library of aptamers that have a unique pair of sequences at the ends (tagged library of randomized aptamers). This PCR product then serves as the template for PCR with a single pair of F and R primers to amplify each tagged template, thus generating the sub-libraries of the original aptamer library.

In a pilot study, 2 biotin-labeled JF primers (JF1 and JF2) and 8 reverse JR primers (JR1 to JR8) were used, resulting in a total of 16 unique pairs of sequence tags. After generating the tagged library by PCR with templates at 1, 2.3 or 4.6 copies per sequence, representing 63%, 90% or 99% of the initial randomized aptamer library, respectively, different primers were used to test the splitting strategy. As shown in the left panel of FIG. 3e, the tagged templates were amplified by primers complementary to the constant region in the aptamer (universal primers), which amplify every template in the library. When a single pair of primers specific to the added tag sequences (F1 and R1) was used (middle panel), but not when a pair of primers that was not included in the tagging (F3 and R1) was used (right panel), the tagged templates were amplified, in a much lower amount than the product amplified with the universal primers, indicating that only a portion (1/16) of the library was amplified. Thus, the original library was split into smaller sub-libraries.
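The representation figures quoted above (63%, 90% and 99% at 1, 2.3 and 4.6 copies per sequence) are consistent with Poisson sampling, in which the expected fraction of library members present at least once is 1 − e^(−c) for an average of c copies per sequence. The sketch below illustrates that calculation; the Poisson model is our assumption about how such figures are typically derived, and the function name is hypothetical.

```python
import math


def expected_representation(copies_per_sequence: float) -> float:
    """Expected fraction of library sequences sampled at least once.

    Assumes the number of copies of each sequence is Poisson-distributed
    with mean `copies_per_sequence`.
    """
    return 1.0 - math.exp(-copies_per_sequence)


# Tag combinations in the pilot study: 2 JF primers x 8 JR primers.
n_tag_pairs = 2 * 8

for copies in (1, 2.3, 4.6):
    pct = 100 * expected_representation(copies)
    print(f"{copies} copies per sequence -> {pct:.1f}% of the library represented")
print(f"{n_tag_pairs} tag pairs -> each pair covers about 1/{n_tag_pairs} of the tagged library")
```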

Example 4

The sensitivity of cell-based assay for library screening.

Procedures:

DNA constructs: Plasmid DNA containing the xpt-G17 riboswitch was diluted in SR-mut construct DNA at different ratios of the two DNA constructs. The mixed G17 and SR-mut plasmid DNA was then transfected into HEK 293 cells. Transfection and luciferase assays were performed as described in Example 1.

Results:

The sensitivity of the cell-based assay for library screening determines how complex, or how large, an aptamer-riboswitch plasmid library can be while still allowing a minimum of 1 positive hit to stand out from the rest of the library in the screen. The assay can measure luciferase activity, fluorescence intensity of a fluorescent protein, or growth hormone/cytokine release, depending on the reporter gene chosen, and genetic elements can be delivered either by transient transfection or by viral transduction, e.g., AAV, adenovirus, lentivirus, etc.

Here, we chose transient transfection to deliver plasmid DNA and used firefly luciferase as the reporter gene, with the xpt-G17 construct as the positive riboswitch control vector. This assay has been extensively tested and used during the development of the xpt-G17 riboswitch in mammalian cells. Construct SR-mut was used as the negative control vector; it has the same genetic elements as the xpt-G17 construct except that it has no guanine aptamer sequence, and therefore does not activate gene expression in response to guanine treatment. These two constructs were mixed together to mimic a pooled library, though an actual riboswitch library is more complex due to the large molecular diversity generated by nucleotide randomization. Cells transfected with 100% xpt-G17 construct DNA yielded 2000-fold induction of luciferase activity upon treatment with 500 μM guanine when compared to untreated cells. When xpt-G17 construct DNA was diluted with SR-mut construct DNA, cells transfected with the mixed DNA showed lower fold induction of luciferase activity. As shown in FIG. 3, the fold induction decreased as the ratio of guanine-responding xpt-G17 construct to non-responding SR-mut construct decreased, but a 2.3-fold induction was still generated when there was 1 positive construct out of 2000 molecules, indicating the feasibility of recovering 1 ligand-responding riboswitch from a mixture of ligand-nonresponding riboswitches.
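As a rough guide to how pool size limits detection, the dilution experiment can be compared with a simple linear-mixing estimate. The sketch below is an idealized model of our own, not the analysis used here: it assumes the positive and negative constructs have identical basal (untreated) expression and that reporter signal scales linearly with the fraction of each construct. Measured values can deviate from it; the assay above gave 2.3-fold at 1 in 2000, against roughly 2-fold from the model.

```python
def pooled_fold_induction(positive_fraction: float,
                          positive_fold: float,
                          negative_fold: float = 1.0) -> float:
    """Expected fold induction of a mixed pool under linear mixing.

    Assumes equal basal expression for responding and non-responding
    constructs (a simplifying assumption, labeled hypothetical).
    """
    f = positive_fraction
    return f * positive_fold + (1.0 - f) * negative_fold


def max_pool_size(threshold_fold: float, positive_fold: float) -> float:
    """Largest pool in which a single positive still reaches `threshold_fold`."""
    return (positive_fold - 1.0) / (threshold_fold - 1.0)


for pool in (1, 10, 100, 500, 2000):
    fold = pooled_fold_induction(1 / pool, positive_fold=2000)
    print(f"1 positive in {pool:>4}: about {fold:.1f}-fold")
```

Under these assumptions, `max_pool_size(2.0, 2000)` gives the pool size at which one xpt-G17-like riboswitch would still produce a 2-fold signal; for other assays or weaker riboswitches the same estimate would need to be checked experimentally.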

For assays other than the one described above, the sensitivity of the assay should be tested to provide guidance for determining the size of the sub-library pools to be screened.

Example 5

Construction of pooled aptamer-based riboswitch plasmid library and splitting of larger riboswitch library into smaller screenable sub-libraries.

Procedures:

Construction of pooled plasmid library of riboswitches: Ultramer oligos containing aptamer sequences with randomized bases (see Example 2 for sequence design and composition) were PCR amplified using the Platinum Pfx kit (Invitrogen) to generate double-stranded DNA fragments, and the PCR product was run on a 4% agarose gel. The 153 bp DNA was gel-purified (Qiagen) and digested with BsaI (NEB). The BsaI-digested DNA fragment was then ligated to the BsaI-digested acceptor vector (mDHFR-Luci-Acceptor) as described in Example 1, at a 1:5 ratio of vector to insert, using T4 DNA ligase (Roche). ElectroMAX DH5α-E competent cells (Invitrogen) were transformed with the ligation product following the manufacturer's instructions and plated onto agar plates. Bacterial colonies were pooled and collected, and DNA was extracted to obtain the plasmid library of riboswitches (P1).

A similar approach was used to generate a smaller plasmid riboswitch library (P2), in which nucleotide bases at 5 positions in the aptamer were randomized, generating a total of 1024 (4⁵) different aptamer sequences (where N denotes a randomized position):

(SEQ ID NO: 10) GACTTCGGTCTCATCCAGAGAATGAAAAAAAAATCTTCAGTAGAAGGTAATGTATANNNGCGTGGATATGGCACGCNNGTTTCTACCGGGCACCGTAAATGTCCGACTACATTACGCACCATTCTAAAGAATAACAGTGAAGAGACCAGACGG

Transformation of chemically competent DH5α: 227 pg of plasmid DNA was used to transform 50 μl of competent cells to obtain a 1:10 ratio of plasmid DNA to bacterial cells. The transformed cells were incubated at 37° C. without shaking for 30 minutes and then plated onto agar plates, and colonies were pooled and collected for DNA extraction using a 96-format miniprep kit (Qiagen) to obtain pooled plasmid sub-libraries of riboswitches.

Next Generation Sequencing (NGS): Plasmid DNA from secondary or tertiary riboswitch sub-libraries was used as the template, and the following primers were used to generate PCR amplicons that contain the randomized aptamer sequences: DHFR_F: 5′-GACTTCGGTCTCATCCAGAGAATGAAAAAAAAATCTTCAGTAGAAGGTAATG-3′ (SEQ ID NO: 11); IVS_R: 5′-CCGTCTGGTCTCTTCACTGTTATTCTTTAGAATGGTGCG-3′ (SEQ ID NO: 12). PCR products were subjected to NGS on the Illumina MiSeq 2×150 bp paired-end platform to generate approximately 700K reads for each sample, followed by bioinformatics analysis for unique sequence identification and relative abundance calculation (service provided by Genewiz). Sequences with 12 or more reads in a sequencing run were considered true sequences.
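The read-count threshold described above can be illustrated with a small tally. This is a minimal sketch rather than the pipeline used by the sequencing provider: it assumes the aptamer region has already been extracted from each read as a string, and the function name, the toy reads, and the choice of total reads as the denominator for relative abundance are our own.

```python
from collections import Counter


def call_true_sequences(aptamer_reads, min_reads=12):
    """Keep sequences seen at least `min_reads` times.

    Returns {sequence: (read_count, relative_abundance)}, where relative
    abundance is the share of all reads in the sample (an assumption).
    """
    counts = Counter(aptamer_reads)
    total = sum(counts.values())
    return {
        seq: (n, n / total)
        for seq, n in counts.items()
        if n >= min_reads
    }


# Toy input: three hypothetical aptamer-region reads with different depths.
toy_reads = ["ATAATC"] * 15 + ["ATAACC"] * 12 + ["ATAGGG"] * 3
called = call_true_sequences(toy_reads)
for seq, (n, frac) in sorted(called.items()):
    print(seq, n, f"{frac:.2f}")
```

The sequence with only 3 reads is discarded as a likely PCR or sequencing error, while the two sequences at or above the 12-read threshold are retained.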

Results

To screen aptamers by a cell-based assay, a plasmid library of riboswitches was generated by cloning the aptamer library into the mDHFR-Luci-Acceptor vector (FIG. 5a). The resulting constructs contain the same configuration of genetic elements as construct xpt-G17, with the only difference being in the aptamer sequences. We started with an aptamer library generated as described in Example 2, a randomized aptamer library comprising 10⁶ unique sequences. To ensure greater than 99.9% representation of the initial aptamer library, a total of 7.5×10⁶ colonies, which is 7.5 times the number of sequences in the aptamer library, were collected from agar plates. The plasmid DNA extracted from the collected colonies forms the plasmid library (P1) consisting of 10⁶ unique riboswitches.

To divide plasmid libraries into sub-libraries that are small enough to be screened using the developed cell-based assay, a strategy was used, as outlined in FIG. 5b, that involves pooling smaller numbers of transformed bacterial colonies and extracting DNA to make plasmid sub-libraries of riboswitches. This process of dividing plasmid libraries can be performed for several rounds, generating primary, secondary or tertiary sub-libraries, respectively, to obtain sub-libraries of the required size, in which a single positive event (i.e., specific aptamer/ligand binding leading to reporter gene expression) can be detected based on the sensitivity of the cell-based assay developed for screening the library. The size of the sub-libraries was calculated as n (sub-library size) = m (fold representation) × N (initial library size) / d (dividing fold). The “dividing fold” represents the total number of sub-libraries to obtain and can be any desired number. Here, we chose 100 as the dividing fold for ease of calculation. For the first round of dividing, 6×10⁶ colonies were collected, which is 6 times the number of riboswitches in the initial plasmid library (10⁶), to obtain greater than 99% representation. For the second round of dividing, 1-fold representation of the primary sub-library was chosen. For the plasmid library with 10⁶ riboswitches that we built (P1), where N=10⁶, m=6, d=100, the size of each individual sub-library is n=6×10⁴. A total of 6×10⁶ bacterial colonies were collected into 100 individual tubes, and DNA was extracted from each individual tube to generate the primary plasmid sub-libraries of riboswitches (P1S_001 through P1S_100). Using the same strategy and starting with sub-library P1S_001 as an example, the primary sub-library was further divided into 100 even smaller secondary sub-libraries named P1S_001_001 through P1S_001_100. Thus, by performing two rounds of dividing, secondary plasmid sub-libraries were generated with 600 riboswitches in each. The sub-libraries of riboswitches can be further divided by a third round of dividing to generate tertiary plasmid sub-libraries.
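The sub-library size formula n = m × N / d can be applied round by round. The sketch below simply reproduces the arithmetic stated above for library P1 (6-fold representation in the first round, 1-fold in the second, dividing fold of 100 in both); the function name is ours.

```python
def sub_library_size(fold_representation: float,
                     library_size: float,
                     dividing_fold: int) -> float:
    """n = m * N / d: colonies (clones) pooled into each sub-library."""
    return fold_representation * library_size / dividing_fold


# Round 1: P1 (N = 10^6 riboswitches), m = 6, d = 100.
primary = sub_library_size(6, 10**6, 100)
# Round 2: one primary sub-library as input, m = 1, d = 100.
secondary = sub_library_size(1, primary, 100)

print(f"primary sub-library size:   {primary:,.0f}")
print(f"secondary sub-library size: {secondary:,.0f}")
```

The same function can be reused to plan a third round, or a different dividing fold, once the sensitivity of the chosen assay indicates how small each pool needs to be.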

The same approach was used to divide plasmid riboswitch library P2, which contains 1024 unique aptamer sequences. By collecting 100 portions of a total of 5000 colonies, 100 primary sub-libraries (P2S_001 to P2S_100) were generated, with each sub-library containing approximately 50 riboswitches.

To determine the composition and quality of the riboswitch libraries generated above, next generation sequencing (NGS) was performed on the secondary plasmid sub-libraries of riboswitches, each of which presumably contains 600 riboswitch sequences. Four secondary sub-libraries were selected at random: two were generated from the primary sub-library P1S_003, and the other two were generated from primary sub-libraries P1S_007 and P1S_048, respectively. As shown in FIG. 5c, each of the secondary sub-libraries contains approximately 500 or 600 unique sequences, consistent with the number of colonies that were collected for generating secondary sub-libraries. Further analysis of the NGS data indicates that the two secondary sub-libraries (P1S_003_004 and P1S_003_041) that were generated from the same primary sub-library (P1S_003) have 39 sequences in common (FIG. 5d). When comparing two secondary sub-libraries, P1S_003_004 and P1S_007_021, that were derived from different primary sub-libraries, P1S_003 and P1S_007, only 3 sequences are shared by both sub-libraries (FIG. 5e). These results indicate that, using the strategy described above, plasmid riboswitch sub-libraries were generated with the desired number of unique sequences and are ready for mammalian cell-based screening.

Example 6

Mammalian cell-based screening for new aptamers against ligands of choice.

As described in Example 5, 100 primary plasmid sub-libraries (P1S_001 through P1S_100), comprising 60 k riboswitches in each pool, were constructed, and 100 secondary plasmid sub-libraries (P1S_001_001 to P1S_001_100), consisting of 600 riboswitches each, were generated by further dividing the primary sub-library P1S_001 using the same strategy. The pooled libraries can be arrayed in 96-well format to facilitate high-throughput screening. A preliminary screen was performed, using the luciferase reporter assay as described in Example 1, on primary sub-libraries P1S_001 to P1S_006 as well as the sub-libraries of P1S_001, with guanine, the ligand of the initial aptamer sequence, as the tested ligand. The basal level of luciferase activity generated by constructs from either primary or secondary sub-libraries varied significantly from that of the xpt-G17 construct (data not shown), suggesting that changes in the aptamer sequence from randomizing bases at the selected positions affected the inclusion/exclusion of the stop codon-containing exon to various extents, thereby affecting basal luciferase expression. Following guanine treatment, cells transfected with the 60 k primary sub-library P1S_005 generated 1.8-fold induction of luciferase activity in comparison to untreated cells (FIG. 6a), but none of the tested primary sub-libraries yielded more than 2-fold induction with guanine as the ligand. However, 7 of the 100 secondary sub-libraries yielded more than 2-fold induction of luciferase activity upon guanine treatment, with sub-library P1S_001_075 generating 7.8-fold induction (FIG. 6b). In the sensitivity assay described in Example 4, 6.3-fold induction was detected when there was 1 xpt-G17 riboswitch among 500 non-ligand-responding molecules. Based on this sensitivity test, the result (7.8-fold) from this preliminary screening of sub-library P1S_001_075 suggests that there is either 1 riboswitch out of 600 that is functionally equivalent to xpt-G17, or there are several weaker riboswitches whose summed induced luciferase activity is comparable to that of xpt-G17.

To further demonstrate the applicability of the mammalian cell-based screening of the present invention for functional aptamer-containing riboswitches, and to discover new aptamers with improved activity in response to a desired ligand, the sub-libraries of plasmid riboswitch library P2 were screened in a 96-well format with NAD+. The nucleotide bases at the randomized positions in the xpt-guanine aptamer have been linked to tuning of riboswitch activity and named the tune box (Stoddard, et al. J Mol Biol. 2013 May 27; 425(10):1596-611). Therefore, changes of nucleotides at these positions potentially generate sequences that have altered riboswitch activity in response to ligand treatment. Due to the nature of guanine and its low applicability in vivo, NAD+ was chosen as the ligand for potential new aptamers. This choice of ligand was based on the above results from screening the Prestwick compound library against the parental xpt-G17 riboswitch, which revealed that NAD+ can regulate the guanine riboswitch, generating approximately 40-fold induction at a concentration of 100 μM. In an attempt to generate aptamer sequences that have improved riboswitch activity against NAD+, we generated and screened the sub-libraries of P2 (having changes of nucleotides at the above-mentioned 5 positions in the aptamer) using luciferase as the reporter gene. As shown in FIG. 6c, multiple sub-libraries, with approximately 50 riboswitches in each, yielded more than 10-fold induction of luciferase expression in response to treatment with 100 μM NAD+, with one of the sub-libraries, P2S_002, generating 37-fold induction, whereas a single xpt-G17 riboswitch construct showed 32-fold induction in response to NAD+ treatment at the same concentration.

These screening results indicate that among the approximately 50 riboswitches in each sub-library that yielded more than 10-fold induction of luciferase expression, there are riboswitches that can produce at least 10-fold induction, assuming all the riboswitches in the library respond to NAD+ treatment. In sub-library P2S_002, which yielded 37-fold induction (higher than the fold induction generated by G17), there is at least 1 riboswitch that functions much better than G17. To further prove this, 96 single constructs derived from sub-library P2S_002 were screened. As shown in FIG. 6d, though multiple constructs lost induction or produced less induction than G17, a number of single constructs produced higher fold induction than the G17 construct, indicating that nucleotide changes in the tune box dramatically affect riboswitch activity in response to ligand treatment in cells. Using this approach, we identified a number of different tune box sequences (as shown in Table 2) with which the riboswitches produced higher fold induction of luciferase than the G17 construct upon NAD+ treatment, with multiple aptamer sequences producing more than 100-fold induction.

TABLE 2
Riboswitches with improved reporter gene expression in mammalian cells in response to ligand, NAD+.

SEQ ID NO:    Construct    Sequence
13            G17          ATAATCGCGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACT
14            #02          ATAACCGCGTGGATATGGCACGCGGGTTTCTACCGGGCACCGTAAATGTCCGACT
15            #16          ATAGCCGCGTGGATATGGCACGCGGGTTTCTACCGGGCACCGTAAATGTCCGACT
16            #17          ATAAGGGCGTGGATATGGCACGCTCGTTTCTACCGGGCACCGTAAATGTCCGACT
17            #21          ATAAATGCGTGGATATGGCACGCATGTTTCTACCGGGCACCGTAAATGTCCGACT
18            #26          ATAAGCGCGTGGATATGGCACGCGCGTTTCTACCGGGCACCGTAAATGTCCGACT
19            #29          ATAGTGGCGTGGATATGGCACGCCAGTTTCTACCGGGCACCGTAAATGTCCGACT
20            #31          ATAAAGGCGTGGATATGGCACGCCGGTTTCTACCGGGCACCGTAAATGTCCGACT
21            #33          ATAGTTGCGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACT
22            #36          ATAGCGGCGTGGATATGGCACGCTGGTTTCTACCGGGCACCGTAAATGTCCGACT
23            #41          ATAATGGCGTGGATATGGCACGCTAGTTTCTACCGGGCACCGTAAATGTCCGACT
24            #46          ATAATTGCGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACT
25            #54          ATAATTGCGTGGATATGGCACGCGAGTTTCTACCGGGCACCGTAAATGTCCGACT
26            #61          ATAATCGCGTGGATATGGCACGCGAGTTTCTACCGGGCACCGTAAATGTCCGACT
27            #69          ATAACTGCGTGGATATGGCACGCGGGTTTCTACCGGGCACCGTAAATGTCCGACT

Tune box sequences are underlined.
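Because the sequences in Table 2 differ from G17 only at a few bases, a position-by-position comparison makes the changes easy to read. The following is a minimal sketch using three rows copied from Table 2; the function name and the 1-based position numbering (counted from the first base of the sequence as listed in the table) are our own conventions.

```python
# Three rows copied from Table 2 (G17 is the parental construct).
TABLE_2 = {
    "G17": "ATAATCGCGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACT",
    "#02": "ATAACCGCGTGGATATGGCACGCGGGTTTCTACCGGGCACCGTAAATGTCCGACT",
    "#46": "ATAATTGCGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACT",
}


def substitutions(reference: str, variant: str):
    """List (position, reference_base, variant_base) for each mismatch.

    Positions are 1-based; sequences must be the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]


for name in ("#02", "#46"):
    print(name, substitutions(TABLE_2["G17"], TABLE_2[name]))
```

For these rows, construct #46 differs from G17 by a single C to T change at position 6, and construct #02 differs at positions 5, 24 and 25, all of which fall within the five positions randomized in SEQ ID NO: 10.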

One of the new constructs, #46, was further tested. As shown in FIG. 6e and FIG. 6f, new construct #46 responded to NAD+ treatment in a dose-dependent manner and showed improvement over the G17 construct in both the level of induced reporter gene expression and the fold induction. The new constructs also have improved gene regulation in response to guanine treatment (data not shown).

Thus, the present invention provides an approach in which a relatively large riboswitch library can be divided into smaller riboswitch sub-libraries that are screenable through a mammalian cell-based assay. Moreover, from the riboswitch library, new sequences that have improved riboswitch activities in mammalian cells were discovered.

REFERENCES

-   1. Mandal, Maumita, Benjamin Boese, Jeffrey E. Barrick, Wade C. Winkler, and Ronald R. Breaker. “Riboswitches Control Fundamental Biochemical Pathways in Bacillus Subtilis and Other Bacteria.” Cell 113, no. 5 (May 30, 2003): 577-86.
-   2. Mandal, Maumita, and Ronald R. Breaker. “Adenine Riboswitches and Gene Activation by Disruption of a Transcription Terminator.” Nature Structural & Molecular Biology 11, no. 1 (January 2004): 29-35. doi:10.1038/nsmb710.
-   3. Mulhbacher, Jerome, and Daniel A. Lafontaine. “Ligand Recognition Determinants of Guanine Riboswitches.” Nucleic Acids Research 35, no. 16 (2007): 5568-80. doi:10.1093/nar/gkm572.
-   4. Serganov, Alexander, Yu-Ren Yuan, Olga Pikovskaya, Anna Polonskaia, Lucy Malinina, Anh Tuan Phan, Claudia Hobartner, Ronald Micura, Ronald R. Breaker, and Dinshaw J. Patel. “Structural Basis for Discriminative Regulation of Gene Expression by Adenine- and Guanine-Sensing mRNAs.” Chemistry & Biology 11, no. 12 (December 2004): 1729-41. doi:10.1016/j.chembiol.2004.11.018.
-   5. Edwards, Andrea L., and Robert T. Batey. “A Structural Basis for the Recognition of 2′-deoxyguanosine by the Purine Riboswitch.” Journal of Molecular Biology 385, no. 3 (Jan. 23, 2009): 938-48. doi:10.1016/j.jmb.2008.10.074.
-   6. Batey, Robert T., Sunny D. Gilbert, and Rebecca K. Montange. “Structure of a Natural Guanine-Responsive Riboswitch Complexed with the Metabolite Hypoxanthine.” Nature 432, no. 7015 (Nov. 18, 2004): 411-15. doi:10.1038/nature03037.
-   7. Serganov, Alexander, Yu-Ren Yuan, Olga Pikovskaya, Anna Polonskaia, Lucy Malinina, Anh Tuan Phan, Claudia Hobartner, Ronald Micura, Ronald R. Breaker, and Dinshaw J. Patel. “Structural Basis for Discriminative Regulation of Gene Expression by Adenine- and Guanine-Sensing mRNAs.” Chemistry & Biology 11, no. 12 (December 2004): 1729-41. doi:10.1016/j.chembiol.2004.11.018.
-   8. Ellington, A. D., and J. W. Szostak. “In Vitro Selection of RNA Molecules That Bind Specific Ligands.” Nature 346, no. 6287 (Aug. 30, 1990): 818-22. doi:10.1038/346818a0.
-   9. Tuerk, C., and L. Gold. “Systematic Evolution of Ligands by Exponential Enrichment: RNA Ligands to Bacteriophage T4 DNA Polymerase.” Science (New York, N.Y.) 249, no. 4968 (Aug. 3, 1990): 505-10.
-   10. Ozer, Abdullah, John M. Pagano, and John T. Lis. “New Technologies Provide Quantum Changes in the Scale, Speed, and Success of SELEX Methods and Aptamer Characterization.” Molecular Therapy. Nucleic Acids 3 (2014): e183. doi:10.1038/mtna.2014.34.
-   11. Kebschull, Justus M., and Anthony M. Zador. “Sources of PCR-Induced Distortions in High-Throughput Sequencing Data Sets.” Nucleic Acids Research 43, no. 21 (Dec. 2, 2015): e143. doi:10.1093/nar/gkv717.

1.-46. (canceled)
 47. A method for splitting a randomized aptamer library into smaller aptamer sub-libraries comprising the steps: (a) providing a randomized aptamer library, wherein the aptamers in the library comprise multiple 5′ and 3′ constant regions and one or more randomized nucleotides, (b) performing a two-cycle PCR using the randomized aptamer library as the template and first primers and second primers that are complementary to the 5′ and 3′ constant regions, the primers each including one of a plurality of tag sequences, (c) isolating the products of the two-cycle PCR, and (d) PCR amplifying a subset of the isolated products of the two-cycle PCR using primers complementary to a subset of the 5′ and 3′ tag sequences.
 48. The method of claim 47, wherein the randomized aptamer library comprises more than about 100,000 aptamers.
 49. The method of claim 47, wherein the randomized aptamer library comprises more than about 1,000,000 aptamers.
 50. The method of claim 47, wherein the first or second primer in the two-cycle PCR comprises a label selected from the group consisting of biotin, digoxigenin (DIG), bromodeoxyuridine (BrdU), fluorophore, and a chemical group used in click chemistry.
 51. A method for splitting a randomized aptamer library into smaller aptamer sub-libraries comprising the steps: (a) providing a randomized aptamer library, wherein the aptamers in the library comprise multiple 5′ and 3′ constant regions and one or more randomized nucleotides and wherein the aptamers in the library are single-stranded, (b) ligating a plurality of 5′ and 3′ tag sequences to the aptamers in the library using a T4 RNA ligase to produce 5′ and 3′ tagged ligation products, and (c) PCR amplifying a subset of the 5′ and 3′ tagged ligation products using primers complementary to a subset of the 5′ and 3′ tag sequences.
 52. The method of claim 51, wherein the randomized aptamer library comprises aptamers having one or more randomized nucleotides.
 53. The method of claim 51, wherein the randomized aptamer library comprises more than about 100,000 aptamers.
 54. The method of claim 51, wherein the randomized aptamer library comprises more than about 1,000,000 aptamers.
 55. The method of claim 51, wherein the first or second primer in the two-cycle PCR comprises a label selected from the group consisting of biotin, digoxigenin (DIG), bromodeoxyuridine (BrdU), fluorophore, and a chemical group used in click chemistry.
 56. A method for splitting a randomized aptamer library into smaller aptamer sub-libraries comprising the steps: (a) providing a randomized aptamer library, wherein the aptamers in the library comprise multiple 5′ and 3′ constant regions and one or more randomized nucleotides and wherein the aptamers in the library are single-stranded, (b) performing a PCR on the aptamers to generate a randomized double-stranded aptamer library, wherein the aptamers in the library comprise multiple 5′ and 3′ constant regions and one or more randomized nucleotides and wherein the aptamers in the library are double-stranded, (c) ligating a plurality of 5′ and 3′ tag sequences to the aptamers in the double-stranded aptamer library using a T4 RNA ligase to produce 5′ and 3′ tagged ligation products, and (d) PCR amplifying a subset of the 5′ and 3′ tagged ligation products using primers complementary to a subset of the 5′ and 3′ tag sequences.
 57. The method of claim 56, wherein the randomized aptamer library comprises aptamers having one or more randomized nucleotides.
 58. The method of claim 56, wherein the randomized aptamer library comprises more than about 100,000 aptamers.
 59. The method of claim 56, wherein the randomized aptamer library comprises more than about 1,000,000 aptamers.
 60. The method of claim 56, wherein the first or second primer in the two-cycle PCR comprises a label selected from the group consisting of biotin, digoxigenin (DIG), bromodeoxyuridine (BrdU), fluorophore, and a chemical group used in click chemistry.
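
The splitting principle shared by the methods above (attach one of a plurality of 5′ and 3′ tag sequences to each library member, then amplify only the members carrying a chosen subset of tags) can be illustrated with a small in-silico sketch. It is a toy model under stated assumptions, not the claimed laboratory procedure: the constant regions, tag sequences, library size, and all function names are hypothetical; each molecule is assumed to receive one randomly chosen 5′ tag and one randomly chosen 3′ tag; and PCR bias, ligation efficiency, and product isolation are not modeled.

```python
import random
from collections import defaultdict

BASES = "ACGT"


def random_library(n, n_random=8, five_const="GGAGC", three_const="GCTCC", seed=0):
    """Build n toy aptamers: 5' constant + randomized stretch + 3' constant.

    The constant-region sequences are placeholders, not taken from the source.
    """
    rng = random.Random(seed)
    return [
        five_const + "".join(rng.choice(BASES) for _ in range(n_random)) + three_const
        for _ in range(n)
    ]


def tag_library(library, five_tags, three_tags, seed=1):
    """Assign each molecule one random 5' tag and one random 3' tag.

    Random assignment is an assumption standing in for a mixed pool of
    tagged primers (two-cycle PCR) or tag oligonucleotides (ligation).
    """
    rng = random.Random(seed)
    return [(rng.choice(five_tags), seq, rng.choice(three_tags)) for seq in library]


def amplify_subset(tagged, five_subset, three_subset):
    """Keep only molecules whose tags match the chosen tag-specific primers."""
    return [seq for t5, seq, t3 in tagged if t5 in five_subset and t3 in three_subset]


def split_into_sublibraries(tagged):
    """Group the tagged library by (5' tag, 3' tag) pair."""
    subs = defaultdict(list)
    for t5, seq, t3 in tagged:
        subs[(t5, t3)].append(seq)
    return dict(subs)
```

In this model, using k 5′ tags and m 3′ tags partitions the library into at most k × m sub-libraries, each of which can be recovered separately by choosing the corresponding tag pair in `amplify_subset`.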